BIOSYNTHESIS OF MEDIUM CHAIN LENGTH 
POLYHYDROXYALKANOATES 

FIELD OF THE INVENTION 

The invention relates to the biosynthesis of polymers and more specifically to the 
biosynthesis of polyhydroxyalkanoate polymers in plants. In particular, the invention relates 
to a transgenic plant producing a peroxisome- or glyoxysome-targeted polyhydroxyalkanoate 
synthase, resulting in the production of polyhydroxyalkanoate materials. 

BACKGROUND OF THE INVENTION 

Polyhydroxyalkanoates (PHAs) are bacterial polyesters that accumulate in a wide variety of bacteria. These 
polymers have properties ranging from stiff and brittle plastics to rubber-like materials, and 
are biodegradable. Because of these properties, PHAs are an attractive source of 
nonpolluting plastics and elastomers. 

Currently, there are approximately a dozen biodegradable plastics in commercial use 
that possess properties suitable for producing a number of specialty and commodity 
products (Lindsay, Modern Plastics 2: 62 (1992)). One such biodegradable plastic in the 
polyhydroxyalkanoate (PHA) family that is commercially important is Biopol™, a random 
copolymer of 3-hydroxybutyrate (3HB) and 3-hydroxyvalerate (3HV). This bioplastic is 
used to produce biodegradable molded material (e.g., bottles), films, coatings, and in drug 
release applications. Biopol™ is produced via a fermentation process employing the 
bacterium Alcaligenes eutrophus (Byrom, Trends Biotechnol. 5: 246 (1987)). The current 
market price is $6-7/lb, and the annual production is 1,000 tons. By best estimates, this 
price can be reduced only about 2-fold via fermentation (Poirier et al., Bio/Technology 13: 
142 (1995)). Competitive synthetic plastics such as polypropylene and polyethylene cost 
about 35-45¢/lb (Layman, Chem. & Eng. News, p. 10 (Oct. 31, 1994)). The annual global 
demand for polyethylene alone is about 37 million metric tons (Layman, Chem. & Eng. 
News, p. 10 (Oct. 31, 1994)). It is therefore likely that the cost of producing 
P(3HB-co-3HV) by microbial fermentation will restrict its use to low-volume specialty 
applications. 

Polyhydroxyalkanoate (PHA) is a family of polymers composed primarily of 
R-3-hydroxyalkanoic acids (Anderson, A. J. & Dawes, E. A. Microbiol. Rev. 54: 450-472 
(1990); Steinbüchel, A. in Novel Biomaterials from Biological Sources, ed. Byrom, D. 
(MacMillan, New York), pp. 123-213 (1991); Poirier, Y., Nawrath, C. & Somerville, C. 
Bio/Technology 13: 143-150 (1995)). Polyhydroxybutyrate (PHB) is the most well 
characterized PHA. High molecular weight PHB is found as intracellular inclusions in a 
wide variety of bacteria (Steinbüchel, A. in Novel Biomaterials from Biological Sources, ed. 
Byrom, D. (MacMillan, New York), pp. 123-213 (1991)). In Alcaligenes eutrophus, PHB 
typically accumulates to 80% dry weight with inclusions being typically 0.2-1 µm in 
diameter. Small quantities of PHB oligomers of approximately 150 monomer units are also 
found associated with membranes of bacteria and eukaryotes, where they form channels 
permeable to calcium (Reusch, R. N., Can. J. Microbiol. 41 (Suppl. 1): 50-54 (1995)). High 
molecular weight PHAs have the properties of thermoplastics and elastomers. Numerous 
bacteria and fungi can hydrolyze PHAs to monomers and oligomers, which are metabolized 
as a carbon source. PHAs have, thus, attracted attention as a potential source of renewable 
and biodegradable plastics and elastomers. PHB is a highly crystalline polymer with rather 
poor physical properties, being relatively stiff and brittle (de Koning, G., Can. J. Microbiol. 
41 (Suppl. 1): 303-309 (1995)). In contrast, PHA copolymers containing monomer units 
ranging from 3 to 5 carbons for short-chain-length PHA (SCL-PHA), or 6 to 14 carbons for 
medium-chain-length PHA (MCL-PHA), are less crystalline and more flexible polymers (de 
Koning, G., Can. J. Microbiol. 41 (Suppl. 1): 303-309 (1995)). 

PHB has been produced in the plant Arabidopsis thaliana expressing the A. 
eutrophus PHB biosynthetic enzymes (Poirier, Y., et al., Science 256: 520-523 (1992); 
Nawrath, C., et al., Proc. Natl. Acad. Sci. U.S.A. 91: 12760-12764 (1994)). In plants 
expressing the PHB pathway in the plastids, leaves accumulated up to 14% PHB per gram 
dry weight (Nawrath, C., et al., Proc. Natl. Acad. Sci. U.S.A. 91: 12760-12764 (1994)). 
High-level synthesis of PHB in plants opened the possibility of utilizing agricultural crops 
as a suitable system for the production of PHAs on a large scale and at low cost (Poirier, Y., 
et al., Bio/Technology 13: 143-150 (1995); Poirier, Y., et al., FEMS Microbiol. Rev. 103: 
237-246 (1992); Nawrath, C., et al., Molecular Breeding 1: 105-22 (1995)). PHB was also 
shown to be synthesized in insect cells expressing a mutant fatty acid synthase (Williams, 
M. D., et al., Appl. Environ. Microbiol. 62: 2540-2546 (1996)), and in yeast expressing the 
A. eutrophus PHB synthase (Leaf, T. A., et al., Microbiol. 142: 1169-1180 (1996)). 

A number of pseudomonads, including Pseudomonas putida and Pseudomonas 
aeruginosa, accumulate MCL-PHAs when cells are grown on alkanoic acids (Anderson, A. 
J. & Dawes, E. A. Microbiol. Rev. 54: 450-472 (1990); Steinbüchel, A. in Novel 
Biomaterials from Biological Sources, ed. Byrom, D. (MacMillan, New York), pp. 123-213 
(1991); Poirier, Y., Nawrath, C. & Somerville, C. Bio/Technology 13: 143-150 (1995)). The 
composition of the PHA produced depends on the substrate used for growth; the polymer is 
typically composed of monomers that are 2n carbons shorter than the substrate. These studies 
indicate that MCL-PHAs are synthesized by the PHA synthase from 3-hydroxyacyl-CoA 
intermediates generated by the β-oxidation of alkanoic acids (Huijberts, G. N. M., et al., 
Appl. Environ. Microbiol. 58: 536-544 (1992); Huijberts, G. N. M., et al., J. Bacteriol. 176: 
1661-1666 (1994)). 
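
The relationship between substrate length and monomer length described above can be 
illustrated with a short sketch. It is not taken from the cited studies: the function name 
is hypothetical, the 6 to 14 carbon window is the MCL-PHA range quoted earlier in this 
section, and allowing n = 0 (a monomer as long as the substrate itself) is an assumption. 

```python
def expected_monomer_lengths(substrate_carbons, min_len=6, max_len=14):
    """List monomer chain lengths that are 2n carbons shorter than the substrate.

    n runs over 0, 1, 2, ... (n = 0 is an assumption of this sketch).
    Only lengths inside the min_len..max_len window are reported; the
    defaults are the 6-14 carbon MCL-PHA range quoted in the text.
    """
    lengths = []
    n = 0
    while substrate_carbons - 2 * n >= min_len:
        length = substrate_carbons - 2 * n
        if length <= max_len:
            lengths.append(length)
        n += 1
    return lengths


# Hypothetical substrates with 10 and 9 carbons.
print(expected_monomer_lengths(10))
print(expected_monomer_lengths(9))
```

Under these assumptions an even-chain substrate yields only even-chain monomers and an 
odd-chain substrate only odd-chain monomers, because each round of β-oxidation removes 
two carbons. 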

There exists a need for novel methods for the biosynthesis of 
polyhydroxyalkanoate materials suitable for commercial applications. Towards this goal, 
this patent application discloses the materials and methods for the use of a 
peroxisome-targeted polyhydroxyalkanoate synthase protein in the biosynthesis of 
polyhydroxyalkanoate polymers. Localization in the peroxisomes allows for the utilization 
of intermediates from the lipid β-oxidation pathway. Plants expressing a P. aeruginosa 
polyhydroxyalkanoate synthase modified for peroxisome targeting produce PHA containing 
saturated and unsaturated 3-hydroxyalkanoic acids ranging from 6 to 16 carbons. 
Polyhydroxyalkanoate granules are found within the glyoxysomes or leaf-type peroxisomes 
of dark- and light-grown plants, respectively, as well as in the vacuoles. 

SUMMARY OF THE INVENTION 

The invention is directed towards materials and methods for the biosynthesis of 
polyhydroxyalkanoate polymers. More particularly, a fusion protein comprising a 
polyhydroxyalkanoate synthase protein subunit and a peroxisome targeting protein subunit 
renders a host cell or plant capable of producing polyhydroxyalkanoate polymer materials. 

In one embodiment, the invention provides a non-naturally occurring fusion protein 
comprising a peroxisome targeting protein subunit and a polyhydroxyalkanoate synthase 
protein subunit. Generally, the peroxisome targeting protein subunit and the 
polyhydroxyalkanoate synthase protein subunit may be any subunit suitable for participation 
in the invention. The peroxisome targeting subunit may be an N-terminal or C-terminal 
subunit. The N-terminal subunit is preferably PTS2. The C-terminal peroxisome targeting 
subunit preferably comprises a tripeptide. The first amino acid in the N-terminus to 
C-terminus direction is preferably S, A, or P. The second amino acid in the N-terminus to 
C-terminus direction is preferably K, R, S, or H. The third amino acid in the N-terminus to 
C-terminus direction is preferably L, M, I, or F. More preferably, the C-terminal peroxisome 
targeting subunit comprises ARM, SRM, SKL, ARL, SRL, PSI, or PRM. The peroxisome targeting 
subunit is preferably at least 70% identical to SEQ ID NO:14, more preferably at least 80% 
identical to SEQ ID NO:14, even more preferably at least 90% identical to SEQ ID NO:14, 
and most preferably is SEQ ID NO:14. The polyhydroxyalkanoate synthase protein subunit 
is preferably a Pseudomonas subunit, and more preferably a Pseudomonas aeruginosa 
subunit. The polyhydroxyalkanoate synthase protein subunit may preferably be either a 
PHAC1 or PHAC2 subunit. The PHAC1 subunit is preferably at least 70% identical to SEQ 
ID NO:2, more preferably at least 80% identical to SEQ ID NO:2, even more preferably at 
least 90% identical to SEQ ID NO:2, and most preferably is SEQ ID NO:2. The PHAC2 
subunit is preferably at least 70% identical to SEQ ID NO:4, more preferably at least 80% 
identical to SEQ ID NO:4, even more preferably at least 90% identical to SEQ ID NO:4, 
and most preferably is SEQ ID NO:4. The fusion protein is preferably at least 70% identical 
to SEQ ID NO:18 or SEQ ID NO:20, more preferably at least 80% identical to SEQ ID 
NO:18 or SEQ ID NO:20, even more preferably at least 90% identical to SEQ ID NO:18 or 
SEQ ID NO:20, and most preferably is SEQ ID NO:18 or SEQ ID NO:20. 
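
The position-by-position rule for the C-terminal tripeptide can be expressed as a small 
check. The following is only an illustrative sketch: the function name and the example 
protein sequences are hypothetical, and matching the rule says nothing about whether a 
given protein is actually imported into peroxisomes. 

```python
# Residues allowed at each tripeptide position, N-terminus to C-terminus,
# exactly as listed in the paragraph above.
ALLOWED_RESIDUES = ("SAP", "KRSH", "LMIF")

# The seven tripeptides named as more preferred in the text.
PREFERRED_TRIPEPTIDES = ["ARM", "SRM", "SKL", "ARL", "SRL", "PSI", "PRM"]


def has_c_terminal_targeting_tripeptide(protein_seq):
    """Return True if the last three residues satisfy the position rule."""
    seq = protein_seq.strip().upper()
    if len(seq) < 3:
        return False
    tail = seq[-3:]
    return all(res in allowed for res, allowed in zip(tail, ALLOWED_RESIDUES))


# Hypothetical one-letter sequences, used only to exercise the check.
print(has_c_terminal_targeting_tripeptide("MKTAYSKL"))
print(has_c_terminal_targeting_tripeptide("MKTAYGGG"))
```

Each of the seven preferred tripeptides satisfies the general rule, which admits 
3 x 4 x 4 = 48 tripeptides in total. 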



In an alternative embodiment, the invention encompasses a nucleic acid segment 
encoding a non-naturally occurring fusion protein. The nucleic acid segment preferably 
comprises a nucleic acid sequence encoding a peroxisome targeting protein subunit, and a 
nucleic acid sequence encoding a polyhydroxyalkanoate synthase protein subunit. The 
nucleic acid sequence encoding a peroxisome targeting protein subunit preferably comprises 
at least a 6 contiguous nucleic acid sequence from SEQ ID NO:13. The length of the 
contiguous nucleic acid sequence may be 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 
etcetera, 50, 51, 52, etcetera, 100, 101, 102, etcetera, up to and including the entire length of 
SEQ ID NO:13. The nucleic acid sequence encoding a peroxisome targeting protein subunit 
is preferably at least 70% identical to SEQ ID NO:13, more preferably at least 80% identical 
to SEQ ID NO:13, even more preferably at least 90% identical to SEQ ID NO:13, and most 
preferably is SEQ ID NO:13. The nucleic acid sequence encoding a peroxisome targeting 
protein subunit preferably hybridizes to SEQ ID NO:13. The nucleic acid sequence 
encoding a polyhydroxyalkanoate synthase protein subunit preferably comprises at least a 6 
contiguous nucleic acid sequence from SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:15, or 
SEQ ID NO:16. The length of the contiguous nucleic acid sequence may be 6, 7, 8, 9, 10, 
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, etcetera, 50, 51, 52, etcetera, 100, 101, 102, etcetera, 
up to and including the entire length of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:15, or 
SEQ ID NO:16. The nucleic acid sequence encoding a polyhydroxyalkanoate synthase 
protein subunit is preferably at least 70% identical to SEQ ID NO:1, SEQ ID NO:3, SEQ ID 
NO:15, or SEQ ID NO:16, more preferably at least 80% identical to SEQ ID NO:1, SEQ ID 
NO:3, SEQ ID NO:15, or SEQ ID NO:16, even more preferably at least 90% identical to 
SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:15, or SEQ ID NO:16, further preferably is SEQ 
ID NO:1, SEQ ID NO:3, SEQ ID NO:15, or SEQ ID NO:16, and most preferably is SEQ ID 
NO:15 or SEQ ID NO:16. The nucleic acid sequence encoding a polyhydroxyalkanoate 
synthase protein subunit preferably hybridizes to SEQ ID NO:1, SEQ ID NO:3, SEQ ID 
NO:15, or SEQ ID NO:16. The encoded peroxisome targeting protein subunit may be an 
N-terminal or C-terminal peroxisome targeting protein subunit. The encoded N-terminal 
peroxisome targeting subunit is preferably PTS-2. The encoded C-terminal peroxisome 
targeting protein subunit preferably comprises a tripeptide. The tripeptide preferably 
comprises a first amino acid in the N-terminus to C-terminus direction being S, A, or P; a 
second amino acid in the N-terminus to C-terminus direction being K, R, S, or H; and a 
third amino acid in the N-terminus to C-terminus direction being L, M, I, or F. The encoded 
tripeptide preferably is ARM, SRM, SKL, ARL, SRL, PSI, or PRM. The nucleic acid 
sequence encoding a polyhydroxyalkanoate synthase protein subunit preferably encodes at 
least a 5 contiguous amino acid sequence from SEQ ID NO:2 or SEQ ID NO:4. The length 
of the contiguous nucleic acid sequence may be 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 
19, 20, etcetera, 50, 51, 52, etcetera, 100, 101, 102, etcetera, up to and including the entire 
length of SEQ ID NO:2 or SEQ ID NO:4. The nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit preferably encodes an amino acid sequence 
at least 70% identical to SEQ ID NO:2 or SEQ ID NO:4, more preferably at least 80% 
identical to SEQ ID NO:2 or SEQ ID NO:4, even more preferably at least 90% identical to 
SEQ ID NO:2 or SEQ ID NO:4, and most preferably is SEQ ID NO:2 or SEQ ID NO:4. 
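
The contiguous-sequence criterion above can be pictured as a search for the longest 
stretch shared by a candidate sequence and a reference sequence. The sketch below assumes 
plain string comparison on a single strand; the function names and the example sequences 
are hypothetical and are not the sequences of the sequence listing. 

```python
def longest_shared_run(query, reference):
    """Length of the longest contiguous stretch present in both sequences."""
    query, reference = query.upper(), reference.upper()
    best = 0
    # prev[j] holds the length of the shared run ending at the previous
    # query position and reference position j (classic dynamic programming).
    prev = [0] * (len(reference) + 1)
    for q in query:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if q == r:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def shares_contiguous_segment(query, reference, min_len=6):
    """True if at least min_len contiguous positions are shared."""
    return longest_shared_run(query, reference) >= min_len


# Hypothetical sequences that share the six-nucleotide stretch ACGTAC.
print(shares_contiguous_segment("GGGACGTACGGG", "TTTACGTACTTT"))
```

The default of 6 mirrors the shortest contiguous length recited for nucleic acid 
sequences; a different minimum, such as 5 for amino acid sequences, can be passed as 
min_len. 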

In an alternative embodiment, the invention discloses a recombinant vector 
comprising in the 5' to 3' direction a) a promoter that directs transcription of a structural 
nucleic acid sequence encoding a non-naturally occurring fusion protein, wherein the fusion 
protein comprises a peroxisome targeting protein subunit and a polyhydroxyalkanoate 
synthase protein subunit, b) a structural nucleic acid sequence encoding a non-naturally 
occurring fusion protein, wherein the fusion protein comprises a peroxisome targeting 
protein subunit and a polyhydroxyalkanoate synthase protein subunit, and c) a 3' 
transcription terminator. The recombinant vector may further comprise a 3' 
polyadenylation signal sequence that directs the addition of polyadenylate nucleotides to the 
3' end of RNA transcribed from the structural nucleic acid coding sequence. The 
recombinant vector may further comprise a selectable marker. The selectable marker may 
generally be any selectable marker suitable for the intended host cell or plant, and preferably 
is a kanamycin resistance marker, a hygromycin resistance marker, or a herbicide resistance 
marker. The promoter may be constitutive, inducible, tissue specific, or combinations 
thereof. The constitutive promoter may generally be any constitutive promoter suitable for the 
intended host cell or plant, and preferably is CaMV35S, enhanced CaMV35S, FMV, mas, 
nos, or ocs. The inducible promoter may generally be any inducible promoter suitable for 
the intended host cell or plant, and preferably is tac, salicylic acid induced, polyacrylic acid 
induced, safener induced, heat shock promoter, nitrate induced, hormone induced, or light 
induced. The tissue specific promoter may generally be any tissue specific promoter 
suitable for the intended host cell or plant, and preferably is the β-conglycinin 7S promoter, 
napin promoter, phaseolin promoter, zein promoter, soybean trypsin inhibitor promoter, 
ACP promoter, stearoyl-ACP desaturase promoter, or oleosin promoter. The nucleic acid 
sequence encoding a peroxisome targeting protein subunit preferably comprises at least a 6 
contiguous nucleic acid sequence from SEQ ID NO: 13. The length of the contiguous 
nucleic acid sequence may be 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, etcetera, 
50, 51, 52, etcetera, 100, 101, 102, etcetera, up to and including the entire length of SEQ ID 
NO: 13. The nucleic acid sequence encoding a peroxisome targeting protein subunit is 
preferably at least 70% identical to SEQ ID NO: 13, more preferably at least 80% identical 
to SEQ ID NO: 13, even more preferably at least 90% identical to SEQ ID NO: 13, and most 
preferably is SEQ ID NO: 13. The nucleic acid sequence encoding a peroxisome targeting 
protein subunit preferably hybridizes to SEQ ID NO:13. The encoded peroxisome targeting 
protein subunit may be an N-terminal or C-terminal peroxisome targeting protein subunit. 
The encoded N-terminal peroxisome targeting subunit is preferably PTS-2. The encoded C- 
terminal peroxisome targeting protein subunit preferably comprises a tripeptide. The 
tripeptide preferably comprises a first amino acid in the N-terminus to C-terminus direction 
being S, A, or P; a second amino acid in the N-terminus to C-terminus direction being K, R, 
S, or H; and a third amino acid in the N-terminus to C-terminus direction being L, M, I, or 
F. The encoded tripeptide preferably is ARM, SRM, SKL, ARL, SRL, PSI, or PRM. The 
encoded polyhydroxyalkanoate synthase protein subunit is preferably a Pseudomonas 
subunit, and more preferably is a Pseudomonas aeruginosa subunit. The nucleic acid 
sequence encoding a polyhydroxyalkanoate synthase protein subunit preferably comprises at 
least a 6 contiguous nucleic acid sequence from SEQ ID NO:1, SEQ ID NO:3, SEQ ID 
NO:15, or SEQ ID NO:16. The length of the contiguous nucleic acid sequence may be 6, 7, 
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, etcetera, 50, 51, 52, etcetera, 100, 101, 102, 
etcetera, up to and including the entire length of SEQ ID NO:1, SEQ ID NO:3, SEQ ID 
NO:15, or SEQ ID NO:16. The nucleic acid sequence encoding a polyhydroxyalkanoate 
synthase protein subunit is preferably at least 70% identical to SEQ ID NO:1, SEQ ID 
NO:3, SEQ ID NO:15, or SEQ ID NO:16, more preferably at least 80% identical to SEQ ID 
NO:1, SEQ ID NO:3, SEQ ID NO:15, or SEQ ID NO:16, even more preferably at least 90% 
identical to SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:15, or SEQ ID NO:16, further 
preferably is SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:15, or SEQ ID NO:16, and most 
preferably is SEQ ID NO:15 or SEQ ID NO:16. The nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit preferably hybridizes to SEQ ID NO:1, SEQ 
ID NO:3, SEQ ID NO:15, or SEQ ID NO:16. The nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit preferably encodes at least a 5 contiguous 
amino acid sequence from SEQ ID NO:2 or SEQ ID NO:4. The length of the contiguous 
nucleic acid sequence may be 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, etcetera, 
50, 51, 52, etcetera, 100, 101, 102, etcetera, up to and including the entire length of SEQ ID 
NO:2 or SEQ ID NO:4. The nucleic acid sequence encoding a polyhydroxyalkanoate 
synthase protein subunit preferably encodes an amino acid sequence at least 70% identical 
to SEQ ID NO:2 or SEQ ID NO:4, more preferably at least 80% identical to SEQ ID NO:2 
or SEQ ID NO:4, even more preferably at least 90% identical to SEQ ID NO:2 or SEQ ID 
NO:4, and most preferably is SEQ ID NO:2 or SEQ ID NO:4. The structural nucleic acid 
sequence preferably comprises SEQ ID NO: 17 or SEQ ID NO: 19, and preferably encodes 
SEQ ID NO:18 or SEQ ID NO:20. 
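
The 70%, 80%, and 90% identity thresholds recited above can be illustrated with a simple 
calculation. The specification does not state here how identity is to be computed, so the 
sketch below rests on assumptions: the two sequences are already aligned to equal length, 
gaps are written as "-", and gap columns count as mismatches. The function names and 
example sequences are hypothetical. 

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned columns holding the same residue.

    Assumes pre-aligned, equal-length inputs; a column containing a gap
    ('-') is counted as a mismatch in this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def preference_tier(identity):
    """Map a percent identity onto the tiers recited in the text."""
    if identity >= 100.0:
        return "most preferred"
    if identity >= 90.0:
        return "even more preferred"
    if identity >= 80.0:
        return "more preferred"
    if identity >= 70.0:
        return "preferred"
    return "below the recited thresholds"


# Hypothetical ten-nucleotide sequences differing at one position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(identity, preference_tier(identity))
```

In practice the alignment step and the treatment of gaps both affect the reported value, 
so any comparison against the thresholds should state the method used. 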

In an alternative embodiment, the invention encompasses a recombinant host cell 
comprising a nucleic acid segment encoding a non-naturally occurring fusion protein, 
wherein the nucleic acid segment comprises a nucleic acid sequence encoding a peroxisome 
targeting protein subunit and a nucleic acid sequence encoding a polyhydroxyalkanoate 
synthase protein subunit. The recombinant host cell may generally be any type of host cell, 
and preferably is a fungal or plant host cell. The fungal cell is generally any type of fungal 
cell, and preferably a Schizosaccharomyces pombe, Streptomyces rimofaciens, Fusarium, 
Aspergillus niger, or Saccharomyces cerevisiae cell. The plant cell is generally any type of 
plant cell, and preferably an alfalfa, banana, barley, bean, cabbage, canola/oilseed rape, 
carrot, castorbean, celery, clover, coconut, corn, cotton, cucumber, linseed, melon, olive, 
palm, parsnip, pea, peanut, pepper, potato, radish, rapeseed, rice, soybean, spinach, 
sunflower, tobacco, tomato, or wheat cell. The recombinant host cell may further comprise 
a nucleic acid segment encoding an acyl-ACP thioesterase, a fatty acyl hydroxylase, a yeast 
multifunctional protein (MFP), or a hydroxyacyl-CoA epimerase. 

A further alternative embodiment describes a genetically transformed plant cell 
comprising in the 5' to 3' direction: a) a promoter to direct transcription of a structural 
nucleic acid sequence encoding a non-naturally occurring fusion protein, wherein the 
structural nucleic acid sequence comprises: i) a nucleic acid sequence encoding a 
peroxisome targeting protein subunit; and ii) a nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit; b) a structural nucleic acid sequence 
encoding a non-naturally occurring fusion protein, wherein the structural nucleic acid 
sequence comprises: i) a nucleic acid sequence encoding a peroxisome targeting protein 
subunit; and ii) a nucleic acid sequence encoding a polyhydroxyalkanoate synthase protein 
subunit; c) a 3' transcription terminator sequence; and d) a 3' polyadenylation signal 
sequence that directs the addition of polyadenylate nucleotides to the 3' end of RNA 
transcribed from the structural nucleic acid coding sequence. The plant cell is generally any 
type of plant cell, and preferably an alfalfa, banana, barley, bean, cabbage, canola/oilseed 
rape, carrot, castorbean, celery, clover, coconut, corn, cotton, cucumber, linseed, melon, 
olive, palm, parsnip, pea, peanut, pepper, potato, radish, rapeseed, rice, soybean, 
spinach, sunflower, tobacco, tomato, or wheat cell. The plant cell may further comprise a 
nucleic acid segment encoding an acyl-ACP thioesterase, a fatty acyl hydroxylase, a yeast 
multifunctional protein (MFP), or a hydroxyacyl-CoA epimerase. 

An additional embodiment describes a genetically transformed plant comprising in 
the 5' to 3' direction: a) a promoter to direct transcription of a structural nucleic acid 
sequence encoding a non-naturally occurring fusion protein, wherein the structural nucleic 
acid sequence comprises: i) a nucleic acid sequence encoding a peroxisome targeting protein 
subunit; and ii) a nucleic acid sequence encoding a polyhydroxyalkanoate synthase protein 
subunit; b) a structural nucleic acid sequence encoding a non-naturally occurring fusion 
protein, wherein the structural nucleic acid sequence comprises: i) a nucleic acid sequence 
encoding a peroxisome targeting protein subunit; and ii) a nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit; c) a 3' transcription terminator sequence; 
and d) a 3' polyadenylation signal sequence that directs the addition of polyadenylate 
nucleotides to the 3' end of RNA transcribed from the structural nucleic acid coding 
sequence. The plant may generally be any type of plant, and preferably an alfalfa, banana, 
barley, bean, cabbage, canola/oilseed rape, carrot, castorbean, celery, clover, coconut, corn, 
cotton, cucumber, linseed, melon, olive, palm, parsnip, pea, peanut, pepper, potato, 
radish, rapeseed, rice, soybean, spinach, sunflower, tobacco, tomato, or wheat plant. The 
promoter may be constitutive, inducible, tissue specific, or combinations thereof. The 
constitutive promoter may generally be any constitutive promoter suitable for the intended 
plant, and preferably is CaMV35S, enhanced CaMV35S, FMV, mas, nos, or ocs. The 
inducible promoter may generally be any inducible promoter suitable for the intended plant, 
and preferably is tac, salicylic acid induced, polyacrylic acid induced, safener induced, heat 
shock promoter, nitrate induced, hormone induced, or light induced. The tissue specific 
promoter is generally any tissue specific promoter, and preferably is the β-conglycinin 7S 
promoter, napin promoter, phaseolin promoter, zein promoter, soybean trypsin inhibitor 
promoter, ACP promoter, stearoyl-ACP desaturase promoter, or oleosin promoter. The 
plant may further comprise a nucleic acid segment encoding an acyl-ACP thioesterase, a 
fatty acyl hydroxylase, a yeast multifunctional protein (MFP), or a hydroxyacyl-CoA 
epimerase. 

The invention describes a method for preparing host cells useful to produce a 
non-naturally occurring fusion protein comprising the steps of: a) selecting a host cell; b) 
transforming the selected host cell with a recombinant vector having a structural nucleic 
acid sequence encoding a non-naturally occurring fusion protein, wherein the structural 
nucleic acid sequence comprises: i) a nucleic acid sequence encoding a peroxisome 
targeting protein subunit; and ii) a nucleic acid sequence encoding a polyhydroxyalkanoate 
synthase protein subunit; and c) obtaining transformed host cells. The vector may further 
comprise a selectable marker. The selectable marker may generally be any selectable 
marker suitable for use in the intended host cell, and more preferably for plants is a 
kanamycin resistance marker, a hygromycin resistance marker, or a herbicide resistance 
marker. The host cell may generally be any type of cell, and preferably is a fungal or plant 
cell. The fungal cell may generally be any type of fungal cell, and more preferably is a 
Schizosaccharomyces pombe, Streptomyces rimofaciens, Fusarium, Aspergillus niger, or 
Saccharomyces cerevisiae cell. The plant cell may generally be any type of plant cell, and 
more preferably is an alfalfa, banana, barley, bean, cabbage, canola/oilseed rape, carrot, 
castorbean, celery, clover, coconut, corn, cotton, cucumber, linseed, melon, olive, palm, 
parsnip, pea, peanut, pepper, potato, radish, rapeseed, rice, soybean, spinach, 
sunflower, tobacco, tomato, or wheat cell. 

The invention further describes a method of preparing a transformed plant useful to 
produce a non-naturally occurring fusion protein comprising the steps of: a) selecting a host 
plant cell; b) transforming the selected host cell with a recombinant vector having a 
structural nucleic acid sequence encoding a non-naturally occurring fusion protein, wherein 
the structural nucleic acid sequence comprises: i) a nucleic acid sequence encoding a 
peroxisome targeting protein subunit; and ii) a nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit; c) obtaining transformed host plant cells; 
and d) regenerating the transformed host plant cells. The vector may further comprise a 
selectable marker. The selectable marker may generally be any selectable marker suitable 
for use in the intended host cell, and more preferably is a kanamycin resistance marker, a 
hygromycin resistance marker, or a herbicide resistance marker. The host plant cell may 
generally be any type of plant cell, and more preferably is an alfalfa, banana, barley, bean, 
cabbage, canola/oilseed rape, carrot, castorbean, celery, clover, coconut, corn, cotton, 
cucumber, linseed, melon, olive, palm, parsnip, pea, peanut, pepper, potato, radish, 
rapeseed, rice, soybean, spinach, sunflower, tobacco, tomato, or wheat cell. The invention 
also encompasses the plant made by the above described methods. 

A preferred embodiment is a method for the preparation of a polyhydroxyalkanoate, 
comprising the steps of: a) obtaining a cell capable of producing a non-naturally occurring 
fusion protein, wherein the fusion protein comprises: i) a peroxisome targeting protein 
subunit; and ii) a polyhydroxyalkanoate synthase protein subunit; b) establishing a culture 
of the cell; and c) culturing the cell under conditions suitable for the production of the 
polyester. The method may further comprise isolating the polyhydroxyalkanoate from the 
cultured cell. The culture may further comprise fatty acids, and more preferably natural 
fatty acids, non-natural or synthetic fatty acids, or mixtures thereof. The cell may generally 
be any type of cell, and preferably is a fungal or plant cell. The fungal cell may generally be 
any type of fungal cell, and more preferably is a Schizosaccharomyces pombe, Streptomyces 
rimofaciens, Fusarium, Aspergillus niger, or Saccharomyces cerevisiae cell. The plant cell 
may generally be any type of plant cell, and more preferably is an alfalfa, banana, barley, 
bean, cabbage, canola/oilseed rape, carrot, castorbean, celery, clover, coconut, corn, cotton, 
cucumber, linseed, melon, olive, palm, parsnip, pea, peanut, pepper, potato, radish, 
rapeseed, rice, soybean, spinach, sunflower, tobacco, tomato, or wheat cell. The 
polyhydroxyalkanoate isolated from the cell may generally be any type of 
polyhydroxyalkanoate, and preferably comprises 3-hydroxyhexanoic acid (H:6), 
3-hydroxyoctanoic acid (H:8), 3-hydroxydecanoic acid (H:10), 3-hydroxydodecanoic acid 
(H:12), 3-hydroxytetradecanoic acid (H:14), 3-hydroxyhexadecanoic acid (H:16), 
3-hydroxyheptanoic acid (H:7), 3-hydroxynonanoic acid (H:9), 3-hydroxyundecanoic acid 
(H:11), 3-hydroxytridecanoic acid (H:13), 3-hydroxyhexadecatrienoic acid (H16:3), 
3-hydroxyhexadecadienoic acid (H16:2), 3-hydroxyhexadecenoic acid (H16:1), 
3-hydroxytetradecatrienoic acid (H14:3), 3-hydroxytetradecadienoic acid (H14:2), 
3-hydroxytetradecenoic acid (H14:1), 3-hydroxydodecadienoic acid (H12:2), 
3-hydroxydodecenoic acid (H12:1), 3-hydroxyoctenoic acid (H8:1), 4-hydroxydecanoic acid, 
8-methyl-3-hydroxynonanoic acid, or 6-methyl-3-hydroxyheptanoic acid monomers. 

In a further preferred embodiment, the invention presents a method for the 
preparation of a polyhydroxyalkanoate, comprising the steps of: a) obtaining a plant capable 
of producing a non-naturally occurring fusion protein, wherein the fusion protein comprises: 
i) a peroxisome targeting protein subunit; and ii) a polyhydroxyalkanoate synthase protein 
subunit; and b) growing the plant under conditions suitable for the production of the 
polyhydroxyalkanoate. The method may further comprise the step of isolating the 
polyhydroxyalkanoate from the plant. The method may further comprise supplementing the 
plant with natural fatty acids, non-natural fatty acids, or mixtures thereof. The plant may 
generally be any type of plant, and preferably is an alfalfa, banana, barley, bean, cabbage, 
canola/oilseed rape, carrot, castorbean, celery, clover, coconut, corn, cotton, cucumber, 
linseed, melon, olive, palm, parsnip, pea, peanut, pepper, potato, radish, rapeseed, 
rice, soybean, spinach, sunflower, tobacco, tomato, or wheat plant. The 
polyhydroxyalkanoate isolated from the plant may generally be any type of 
polyhydroxyalkanoate, and preferably comprises 3-hydroxyhexanoic acid (H:6), 
3-hydroxyoctanoic acid (H:8), 3-hydroxydecanoic acid (H:10), 3-hydroxydodecanoic acid 
(H:12), 3-hydroxytetradecanoic acid (H:14), 3-hydroxyhexadecanoic acid (H:16), 
3-hydroxyheptanoic acid (H:7), 3-hydroxynonanoic acid (H:9), 3-hydroxyundecanoic acid 
(H:11), 3-hydroxytridecanoic acid (H:13), 3-hydroxyhexadecatrienoic acid (H16:3), 
3-hydroxyhexadecadienoic acid (H16:2), 3-hydroxyhexadecenoic acid (H16:1), 
3-hydroxytetradecatrienoic acid (H14:3), 3-hydroxytetradecadienoic acid (H14:2), 
3-hydroxytetradecenoic acid (H14:1), 3-hydroxydodecadienoic acid (H12:2), 
3-hydroxydodecenoic acid (H12:1), 3-hydroxyoctenoic acid (H8:1), 4-hydroxydecanoic acid, 
8-methyl-3-hydroxynonanoic acid, or 6-methyl-3-hydroxyheptanoic acid monomers. 


The invention further encompasses plants containing polyhydroxyalkanoates, 
wherein the polyhydroxyalkanoate comprises 3-hydroxyhexanoic acid (H:6), 
3-hydroxyoctanoic acid (H:8), 3-hydroxydecanoic acid (H:10), 3-hydroxydodecanoic acid 
(H:12), 3-hydroxytetradecanoic acid (H:14), 3-hydroxyhexadecanoic acid (H:16), 
3-hydroxyheptanoic acid (H:7), 3-hydroxynonanoic acid (H:9), 3-hydroxyundecanoic acid 
(H:11), 3-hydroxytridecanoic acid (H:13), 3-hydroxyhexadecatrienoic acid (H16:3), 
3-hydroxyhexadecadienoic acid (H16:2), 3-hydroxyhexadecenoic acid (H16:1), 
3-hydroxytetradecatrienoic acid (H14:3), 3-hydroxytetradecadienoic acid (H14:2), 
3-hydroxytetradecenoic acid (H14:1), 3-hydroxydodecadienoic acid (H12:2), 
3-hydroxydodecenoic acid (H12:1), 3-hydroxyoctenoic acid (H8:1), 4-hydroxydecanoic acid, 
8-methyl-3-hydroxynonanoic acid, or 6-methyl-3-hydroxyheptanoic acid monomers. 

In an alternative embodiment, the invention describes polyhydroxyalkanoates 
comprising 3-hydroxyhexadecatrienoic acid (H16:3), 3-hydroxyhexadecadienoic acid 
(H16:2), 3-hydroxytetradecatrienoic acid (H14:3), or 3-hydroxydodecadienoic acid (H12:2) 
monomers. 

DESCRIPTION OF THE FIGURES 

The following figure forms part of the present specification and is included to further 
demonstrate certain aspects of the present invention. The invention may be better 
understood by reference to the figure in combination with the detailed description of 
specific embodiments presented herein. 

Figure 1: GC-MS analysis of PHA in transgenic plants. Transesterified 
chloroform extracts from phaC1-transformed line 3.3 (1A, 1B) and vector-transformed line 
21 (1C, 1D) were analyzed. In panels 1A and 1C, the total ion chromatogram is presented, 
while in panels 1B and 1D, only ions with a mass-to-charge ratio of 103 are shown. 

DESCRIPTION OF THE SEQUENCE LISTINGS 

The following sequence listings form part of the present specification and are 
included to further demonstrate certain aspects of the present invention. The invention may 
be better understood by reference to one or more of these sequence listings in combination 
with the detailed description of specific embodiments presented herein. 

SEQ ID NO:1   Wild type PHA synthase C1 nucleic acid sequence. 
SEQ ID NO:2   Wild type PHA synthase C1 protein sequence. 
SEQ ID NO:3   Wild type PHA synthase C2 nucleic acid sequence. 
SEQ ID NO:4   Wild type PHA synthase C2 protein sequence. 
SEQ ID NO:5   Forward PCR primer for PHA synthase C1 fusion sequence. 
SEQ ID NO:6   Reverse PCR primer for PHA synthase C1 fusion sequence. 
SEQ ID NO:7   Forward PCR primer for PHA synthase C2 fusion sequence. 
SEQ ID NO:8   Reverse PCR primer for PHA synthase C2 fusion sequence. 
SEQ ID NO:9   Wild type isocitrate lyase nucleic acid sequence. 
SEQ ID NO:10  Wild type isocitrate lyase protein sequence. 
SEQ ID NO:11  Forward PCR primer for isocitrate lyase fusion sequence. 
SEQ ID NO:12  Reverse PCR primer for isocitrate lyase fusion sequence. 
SEQ ID NO:13  Nucleic acid sequence encoding the isocitrate lyase 
              peroxisome targeting protein subunit. 
SEQ ID NO:14  Isocitrate lyase peroxisome targeting protein subunit. 
SEQ ID NO:15  PHA synthase C1 nucleic acid sequence with plant preferred 
              codon. 
SEQ ID NO:16  PHA synthase C2 nucleic acid sequence with plant preferred 
              codon. 
SEQ ID NO:17  Nucleic acid sequence encoding PHA synthase C1 and 
              isocitrate lyase fusion protein. 
SEQ ID NO:18  PHA synthase C1 and isocitrate lyase fusion protein. 
SEQ ID NO:19  Nucleic acid sequence encoding PHA synthase C2 and 
              isocitrate lyase fusion protein. 
SEQ ID NO:20  PHA synthase C2 and isocitrate lyase fusion protein. 
SEQ ID NO:21  PCR amplified nucleic acid sequence encoding wild type 
              Candida albicans MFP. 
SEQ ID NO:22  Wild type Candida albicans MFP protein. 
SEQ ID NO:23  PCR amplified nucleic acid sequence encoding SKL mutant 
              Candida albicans MFP. 
SEQ ID NO:24  Candida albicans MFP protein with SKL substitution for 
              AKI. 
SEQ ID NO:25  PCR amplified nucleic acid sequence encoding mutant 
              Candida albicans MFP lacking AKI sequence. 
SEQ ID NO:26  Candida albicans MFP protein lacking AKI sequence. 



DEFINITIONS 

The following definitions are provided in order to aid those skilled in the art in 
understanding the detailed description of the present invention. 



"Acyl-ACP thioesterase" refers to proteins which catalyze the hydrolysis of acyl- 
ACP thioesters. 



"C-terminal region" refers to the region of a peptide, polypeptide, or protein chain 
from the middle thereof to the end that carries the amino acid having a free α-carboxyl group 
(the C-terminus). 

"CoA" refers to coenzyme A. 



The phrases "coding sequence", "open reading frame", and "structural sequence" 
refer to the region of continuous sequential nucleic acid triplets encoding a protein, 
polypeptide, or peptide sequence. 

The term "encoding DNA" or "encoding nucleic acid" refers to chromosomal 
nucleic acid, plasmid nucleic acid, cDNA, or synthetic nucleic acid which codes on 
expression for any of the proteins or fusion proteins discussed herein. 

"Fatty acyl hydroxylase" refers to proteins which catalyze the conversion of fatty 1 
acids to hydroxylated fatty acids. 

The term "gene" refers to chromosomal DNA, plasmid DNA, cDNA, synthetic 
DNA, or other DNA that encodes a peptide, polypeptide, protein, or RNA molecule, and 
regions flanking the coding sequence involved in the regulation of expression. 

The term "genome" as it applies to bacteria encompasses both the chromosome and 
plasmids within a bacterial host cell. Encoding DNAs of the present invention introduced 
into bacterial host cells can therefore be either chromosomally-integrated or plasmid- 
localized. The term "genome" as it applies to plant cells encompasses not only 
chromosomal DNA found within the nucleus, but organelle DNA found within subcellular 
components of the cell. DNAs of the present invention introduced into plant cells can 
therefore be either chromosomally-integrated or organelle-localized. 

"Glyoxysome" and "peroxisome" refer to the same organelle in a plant. 
Glyoxysome refers to a type of peroxisome found in germinating seedlings, senescing 
tissues, or in dark-grown tissues. Glyoxysomes and peroxisomes contain enzymes 
responsible for the conversion of lipids to carbohydrates. 

"Identity" refers to the degree of similarity between two nucleic acid or protein 
sequences. An alignment of the two sequences is performed by a suitable computer 
program. A widely used and accepted computer program for performing sequence 
alignments is CLUSTALW v1.6 (Thompson, et al., Nucl. Acids Res. 22: 4673-4680 (1994)). 
The number of matching bases or amino acids is divided by the total number of bases or 
amino acids, and multiplied by 100 to obtain a percent identity. For example, if two 580 
base pair sequences had 145 matched bases, they would be 25 percent identical. If the two 
compared sequences are of different lengths, the number of matches is divided by the 
shorter of the two lengths. For example, if there were 100 matched amino acids between a 
200 and a 400 amino acid protein, they are 50 percent identical with respect to the shorter 
sequence. 
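
The percent identity arithmetic defined above can be illustrated with a minimal Python sketch. The function name `percent_identity` is hypothetical and is not part of the specification; the sketch assumes the number of matches has already been obtained from an alignment program and only reproduces the division by the shorter length.

```python
def percent_identity(matches: int, len_a: int, len_b: int) -> float:
    """Percent identity: matches divided by the shorter sequence length, times 100."""
    shorter = min(len_a, len_b)
    if shorter <= 0:
        raise ValueError("sequence lengths must be positive")
    if not 0 <= matches <= shorter:
        raise ValueError("matches must lie between 0 and the shorter length")
    return 100.0 * matches / shorter


# The two worked examples from the definition above.
print(percent_identity(145, 580, 580))  # two 580 base pair sequences, 145 matches
print(percent_identity(100, 200, 400))  # 200 and 400 amino acid proteins, 100 matches
```

For sequences of equal length the shorter length is simply the common length, so a single function covers both cases given in the definition.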

The terms "microbe" or "microorganism" refer to algae, bacteria, fungi, and 
protozoa. 

"N-terminal region" refers to the region of a peptide, polypeptide, or protein chain 
from the amino acid having a free α-amino group to the middle of the chain. 

"Nucleic acid" refers to ribonucleic acid (RNA) and deoxyribonucleic acid (DNA). 

A "nucleic acid segment" is a nucleic acid molecule that has been isolated free of 
total genomic DNA of a particular species, or that has been synthesized. Included with the 
term "nucleic acid segment" are DNA segments, recombinant vectors, plasmids, cosmids, 
phagemids, phage, viruses, etcetera. 

"Overexpression" refers to the expression of a polypeptide or protein encoded by a 
DNA introduced into a host cell, wherein said polypeptide or protein is either not normally 
present in the host cell, or wherein said polypeptide or protein is present in said host cell at a 
higher level than that normally expressed from the endogenous gene encoding said 
polypeptide or protein. 

The term "plastid" refers to the class of plant cell organelles that includes 
amyloplasts, chloroplasts, chromoplasts, elaioplasts, eoplasts, etioplasts, leucoplasts, and 
proplastids. These organelles are self-replicating, and contain what is commonly referred to 
as the "chloroplast genome," a circular DNA molecule that ranges in size from about 120 to 
about 217 kb, depending upon the plant species, and which usually contains an inverted 
repeat region (Fosket, Plant Growth and Development, Academic Press, Inc., San Diego, 
CA, p. 132 (1994)). 

"Polyadenylation signal" or "poly A signal" refers to a nucleic acid sequence located 
3' to a coding region that directs the addition of adenylate nucleotides to the 3' end of the 
mRNA transcribed from the coding region. 

The term "polyhydroxyalkanoate (or PHA) synthase" refers to enzymes that convert 
hydroxyacyl-CoAs to polyhydroxyalkanoates and free CoA. 

The term "promoter" or "promoter region" refers to a nucleic acid sequence, usually 
found upstream (5') to a coding sequence, that controls expression of the coding sequence 
by controlling production of messenger RNA (mRNA) by providing the recognition site for 
RNA polymerase and/or other factors necessary for start of transcription at the correct site. 
As contemplated herein, a promoter or promoter region includes variations of promoters 
derived by means of ligation to various regulatory sequences, random or controlled 
mutagenesis, and addition or duplication of enhancer sequences. The promoter region 
disclosed herein, and biologically functional equivalents thereof, are responsible for driving 
the transcription of coding sequences under their control when introduced into a host as part 
of a suitable recombinant vector, as demonstrated by its ability to produce mRNA. 

"Protein subunit" refers to a protein sequence that is part of a fusion protein. 
Examples are β-galactosidase, FLAG, green fluorescent protein, and in the instant 
invention, polyhydroxyalkanoate synthase, and a peroxisome or glyoxysome targeting 
peptide. 

"PTS2" refers to an N-terminal protein subunit having the sequence 
(R/K)(L/Q/I)XXXXX(H/Q)L, wherein X is any amino acid. 
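
The PTS2 consensus given above can be expressed as a regular expression. The following Python sketch is illustrative only: the function name `find_pts2`, the 40-residue N-terminal search window, and the example sequence are hypothetical assumptions and do not come from the specification.

```python
import re

# (R/K)(L/Q/I)XXXXX(H/Q)L in one-letter amino acid codes, X = any amino acid.
PTS2_PATTERN = re.compile(r"[RK][LQI][A-Z]{5}[HQ]L")


def find_pts2(protein: str, n_terminal_window: int = 40):
    """Return (start_index, motif) pairs found within the first residues.

    The window size is an assumption made for illustration; the definition
    above only states that PTS2 is an N-terminal subunit.
    """
    region = protein.upper()[:n_terminal_window]
    return [(m.start(), m.group()) for m in PTS2_PATTERN.finditer(region)]


# Hypothetical sequence containing the motif RLQVVLGHL at 0-based index 2.
print(find_pts2("MHRLQVVLGHLAGAS"))
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for checking whether a candidate N-terminus carries the motif at all.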

"Regeneration" refers to the process of growing a plant from a plant cell (e.g., plant 
protoplast or explant). 

"Transformation" refers to a process of introducing an exogenous nucleic acid 
sequence (e.g., a vector, recombinant nucleic acid molecule) into a cell or protoplast in 
which that exogenous nucleic acid is incorporated into a chromosome or is capable of 
autonomous replication. 

A "transformed cell" or "transgenic cell" is a cell whose DNA has been altered by 
the introduction of an exogenous nucleic acid molecule into that cell. 

A "transformed plant" or "transgenic plant" is a plant whose DNA has been altered 
by the introduction of an exogenous nucleic acid molecule into that plant, or by the 
introduction of an exogenous nucleic acid molecule into a plant cell from which the plant 
was regenerated or derived. 

The following examples are included to demonstrate preferred embodiments of the 
invention. It should be appreciated by those of skill in the art that the techniques disclosed 
in the examples which follow represent techniques discovered by the inventors to function 
well in the practice of the invention, and thus can be considered to constitute preferred 
modes for its practice. However, those of skill in the art should, in light of the present 
disclosure, appreciate that many changes can be made in the specific embodiments which 
are disclosed and still obtain a like or similar result without departing from the spirit and 
scope of the invention. 

EXAMPLES 

EXAMPLE 1: Plant material 

Arabidopsis thaliana, race Columbia, was transformed by the vacuum infiltration 
method (Bechtold, N., et al., C.R. Acad. Sci. Paris 316: 1194-1199 (1993)). Transformants 
were selected on media containing Murashige and Skoog salts ("MS", Murashige, T. and 
Skoog, F., Physiol. Plant. 15: 473-497 (1962)), 1% (w/v) sucrose, 0.7% (w/v) agar and 50 
µg/mL kanamycin. Kanamycin-resistant plants were subsequently transferred to soil and 
grown under continuous fluorescent light at 19°C. In some experiments, plants were grown 
under constant agitation (100 rpm) for 1-2 weeks in liquid media containing MS salts and 
2% sucrose. 

EXAMPLE 2: Cloning of peroxisomally targeted PHA synthases C1 and C2 

The phaC1 and phaC2 genes were obtained from Steinbüchel (Timm, A. and 
Steinbüchel, A., Eur. J. Biochem. 209: 14-30 (1992), GenBank Accession Number 
X66592). PCR was used to amplify the genes and to modify their 5'- and 3'-termini as 
follows. At the 5'-end, the codons encoding the serine-2 and the arginine-2 residues of 
phaC1 and phaC2, respectively, were modified to conform more closely with the general 
codon preferences of A. thaliana (Meyerowitz, E. M. in Methods in Arabidopsis Research, 
eds. Koncz, C., Chua, N.-H. & Schell, J. (World Scientific Publishing, Singapore), pp. 
100-119 (1992)). At the 3'-end, the sequences were modified to obtain suitable cloning 
sites and to delete the stop codons, enabling the construction of chimeric fusions with the 
peroxisomal targeting sequence. 

The carboxy-terminal 35 amino acid residues of isocitrate lyase (ICL) 
(Olsen, L.J., et al., Plant Cell 5: 941-952 (1993), GenBank Accession Number Y13356) 
from Brassica napus were used as the targeting sequence for the PHA synthases C1 and C2. 
It has been shown previously that this sequence is sufficient to target 
chloramphenicol acetyl transferase (CAT) to the peroxisomes in A. 
thaliana (Comai, L. et al., The Plant Cell 1: 293-300 (1989); Olsen, L. J. et al., The Plant 
Cell 5: 941-952 (1993); Zhang, J. Z. et al., Mol. Gen. Genet. 238: 177-184 (1993)). A PCR 
product encoding the ICL targeting sequence was cloned into the vector pART7 (Gleaves, 
A.P., Plant Mol. Biol. 20: 1202-1207 (1992), GenBank Accession Number X69707). The 
PCR products containing the phaC1 or phaC2 genes were cloned 5'-upstream of the ICL 
sequence to produce a contiguous open reading frame encoding the targeted fusion proteins. 
The 5'- and 3'-ends of the genes in the resulting plasmids pART7_phaC1_ICL and 
pART7_phaC2_ICL were sequenced to verify the modifications. 

The PHA accumulation-deficient mutant Pseudomonas putida KT2440 NK2:3 was 
obtained from Steinbüchel for complementation studies to verify the enzyme activities of 
the modified PHA synthases C1 and C2. The phaC1_ICL and phaC2_ICL genes were 
cloned into the broad-host range plasmid pVLT35 behind the IPTG-inducible tac-promoter 
(Lorenzo, V. et al., Gene 123: 17-24 (1993)) and electroporated into the P. putida mutant. 
Streptomycin-resistant transformants were subcultured onto minimal medium containing 
either octanoate or gluconate as sole carbon source. The Nile Blue A fluorescence stain 
(Page, W. J. and C. J. Tenove, Biotechnology Techniques 10: 215-220 (1996)) was used to 
visualize PHA accumulation. Upon IPTG induction, PHA accumulation was observed with 
pVLT35_phaC1_ICL and pVLT35_phaC2_ICL, but not with pVLT35 alone, thus 
indicating that the modified genes were still active. 

EXAMPLE 3: Plant transformation and screening for PHA synthase C1 transgenic plants 

The NotI-cassettes of plasmids pART7_phaC1_ICL and pART7_phaC2_ICL 
containing the modified genes flanked by the Cauliflower mosaic virus 35S promoter 
(CaMV35S) and the octopine synthase (ocs) 3'-terminator were cloned into the plant binary 
vector pART27 to obtain pART27_phaC1_ICL and pART27_phaC2_ICL. These plasmids 
were transformed into A. thaliana ecotype Columbia by Agrobacterium GV3101-mediated 
transfer utilizing an in planta vacuum-infiltration method (Bechtold, N. et al., C.R. Acad. 
Sci. Paris 316: 1194-1199 (1993)). Transgenic T1 plants were selected for antibiotic 
resistance during germination of the seeds of infiltrated plants on plant growth medium 
containing mineral salts, sucrose and kanamycin. Negative control plants containing only 
the insert-less T-DNA of the vector pART27 were obtained in the same way. 

Transgenic PHAC1 plants (T1) expressing high amounts of PHA synthase C1 were 
selected by Western analysis with an antiserum against the PHA synthase C1, which was 
obtained from Steinbüchel's laboratory. Unfortunately, no antibodies against PHA synthase 
C2 were found to be suitable, so a different screening strategy was used (see below). Six 
independent lines expressing varying quantities of PHA synthase C1 were obtained from 12 
originally infiltrated plants, which had been harvested individually (another 19 have not yet 
been investigated). Initially some problems with the Western analysis were encountered, 
one of which was the precipitation of the PHA synthase in plant protein extracts upon 
freezing. Analysis of the kanamycin segregation of the second generation (T2) and third 
generation (T3) plants indicated that three of these lines contained multilocus T-DNA 
inserts. Initially these lines exhibited the highest expression of PHA synthase C1 as judged 
by Western analysis; however, the expression of the transgene in these lines was variable in 
plants of the T2 and T3 generations, and complete "silencing" was observed. The line 
PHAC1#3.3 was finally chosen for further studies, because it contained a single-locus 
T-DNA insert and exhibited stable expression of the transgene as seen on the Western blot. 
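
The segregation analysis mentioned above can be sketched as a chi-square goodness-of-fit test. The following Python sketch is an assumption-laden illustration, not the procedure used by the inventors: it assumes that a single dominant T-DNA locus in a selfed hemizygous parent gives a 3:1 ratio of kanamycin-resistant to kanamycin-sensitive progeny, the function names are hypothetical, and the seedling counts are invented for the example.

```python
# Critical chi-square value for 1 degree of freedom at the 0.05 level.
CRITICAL_DF1_P05 = 3.841


def chi_square_ratio(resistant: int, sensitive: int, expected_ratio=(3, 1)) -> float:
    """Chi-square statistic for observed counts against an expected ratio."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no seedlings were scored")
    expected_resistant = total * expected_ratio[0] / sum(expected_ratio)
    expected_sensitive = total - expected_resistant
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def consistent_with_single_locus(resistant: int, sensitive: int) -> bool:
    """True if the counts do not deviate significantly from 3:1."""
    return chi_square_ratio(resistant, sensitive) < CRITICAL_DF1_P05


# Hypothetical counts: 75:25 fits 3:1, while 94:6 suggests more than one locus.
print(consistent_with_single_locus(75, 25))
print(consistent_with_single_locus(94, 6))
```

A line whose progeny show far fewer sensitive seedlings than one in four would, under these assumptions, be flagged as a candidate multilocus insertion.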

EXAMPLE 4: PHA production by PHAC1 plants 

A protocol for the detection of monomers of PHA by gas chromatography was 
developed based on the method described for the extraction of PHB from Arabidopsis 
(Poirier, Y. et al., Int. J. Biol. Macromol. 17: 7-12 (1995)). Whole leaves were extracted 
several times with ethanol and methanol to elute all the soluble lipids; thereafter chloroform 
and methanol acidified with 3% (v/v) H2SO4 were added in equal volumes and the reactions 
were held at 98°C for 4 hours to transesterify the PHA polyester. GC chromatograms of the 
resulting chloroform extracts showed a large number of peaks, most of which were due to 
the derivatization of various leaf compounds. Peaks corresponding to the standards of the 
expected methyl esters of PHA monomers were, however, distinguishable amongst the 
others. A large fraction of the plant material was solubilized during this transesterification 
treatment; it was, however, not determined whether underivatized PHA remained in the solid 
underivatized material. This made the quantification of the PHA in plant material slightly 
uncertain, but the authors estimated that most of the PHA in the material was 
preferentially derivatized. The GC standards (from Sigma Chemical, St. Louis, MO, except 
H6, which was from Beat Keller) were the methyl esters of D-3-hydroxy-hexanoic acid 
(3-OH-caproic acid, H6 monomer), DL-3-hydroxy-octanoic acid (3-OH-caprylic acid, H8 
monomer), DL-3-hydroxy-capric acid (H10 monomer), DL-3-hydroxy-lauric acid (H12 
monomer) and DL-3-hydroxy-myristic acid (H14 monomer). 

WO 99/35278 PCT/US98/00083 

The transgenic plants expressing the PHA synthase C1 showed a significant increase 
in the size of the peaks corresponding to the H6-H14 monomers compared to the negative 
control plants. One novel peak was found only in PHAC1 plants and never in the negative 
controls. GC-MS was used to confirm that the peaks observed in both the PHAC1 plants 
and the negative controls were identical to the standards, and the novel peak was 
determined to be due to 3-hydroxy-octenoyl-methyl-ester containing a single unsaturated 
bond (H8:1 monomer). It is speculated that the unsaturated bond is located at carbon 
5 and has the cis conformation, and that this monomer is due to the degradation of 
α-linolenic acid (18:3, all-cis, Δ9,12,15) and 16:3 (all-cis, Δ7,10,13) by β-oxidation. This 
reasoning is based on the prediction that a D-3-hydroxy-octenoyl-CoA β-oxidation 
intermediate arises due to the cis-double bond at the even-numbered carbons (Gerhardt, B., 
Lipid metabolism in plants (Moore, T. S., Jr., ed.), CRC Press Inc., pp. 527-565 (1993); see 
further discussion below under feeding studies). The same argument can be made for the 
generation of the other monomers incorporated into the PHA, i.e. that they originated from 
fatty acids having a double bond at even-numbered carbons, which resulted in the formation 
of D-3-hydroxy-acyl-CoA β-oxidation intermediates. Thus the H8 monomer would 
originate from the degradation of linoleic acid (C18:2, all-cis, Δ9,12) or from C16:2, all-cis, 
Δ7,10. This, however, does not satisfactorily explain the whole range of monomers 
observed, e.g. the H6 monomer would then have to originate from the fatty acids C18:1, 
Δ14-cis or C16:1, Δ12-cis, while the H14 monomer would have to originate from C18:1, 
Δ8-cis, or C16:1, Δ4-cis or C14:1, Δ2-cis, etcetera. As most of these would be rather 
uncommon fatty acids in A. thaliana, another argument for the origin of these PHA 
monomers can be proposed, which is based on the existence of an epimerase activity in 
plant β-oxidation (Preisig-Müller, R. et al., J. Biol. Chem. 269: 20475-20481 (1994)). In 
this case the D-3-OH-acyl-CoA β-oxidation intermediates are generated at a low rate by the 
"reverse" reaction catalyzed by the epimerase required for the conversion of 
D-3-hydroxy-acyl-CoA to the L-form, and sequestration of these D-intermediates into PHA 
actually drives the reverse reaction. In this way the whole range of possible monomers can be 
explained, while the argument involving the unsaturated bond at even-numbered carbons in 
the acyl chains would still explain the relatively higher proportion of the H8 monomer and 
the existence of the H8:1 monomer. 
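
The chain-length argument above can be followed numerically. The Python sketch below assumes the simplified model described in the text: each β-oxidation cycle removes two carbons from the carboxyl end, and a cis double bond at an even-numbered carbon gives rise to a D-3-hydroxy-acyl-CoA once it has moved to carbon 2. The function name `d3_hydroxy_monomers` is hypothetical, and the model ignores all other enzymatic details.

```python
def d3_hydroxy_monomers(chain_length: int, double_bonds: tuple) -> list:
    """Predict D-3-hydroxy monomers from cis double bonds at even positions.

    Returns a list of (monomer_carbons, remaining_double_bond_positions).
    Simplified model: a double bond at even position p reaches carbon 2
    after (p - 2) / 2 cycles, i.e. after p - 2 carbons have been removed.
    """
    monomers = []
    for position in double_bonds:
        if position < 2 or position % 2 != 0:
            continue  # only even-numbered positions are considered in this model
        carbons_removed = position - 2
        monomer_carbons = chain_length - carbons_removed
        remaining = tuple(sorted(p - carbons_removed
                                 for p in double_bonds if p > position))
        monomers.append((monomer_carbons, remaining))
    return monomers


# alpha-linolenic acid (18:3, delta 9,12,15): 8 carbons, double bond at carbon 5.
print(d3_hydroxy_monomers(18, (9, 12, 15)))
# linoleic acid (18:2, delta 9,12): saturated 8-carbon monomer.
print(d3_hydroxy_monomers(18, (9, 12)))
```

Under these assumptions the monomer length is the chain length minus the double bond position plus two, which reproduces the H8:1 and H8 assignments discussed above.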

Several negative control plants (both A. thaliana wild type and pART27 transgenic 
plants) were analyzed in various experiments without ever seeing more than trace 
amounts of the various saturated monomers. The concentrations present in the negative 
controls were at least 1000 times smaller than in the positive plants, close to the detection 
limit of the methods available to us. This was determined by utilizing the GC-MS in the SIM 
mode (selected ion monitoring; ion 103 is characteristic for all of these 3-OH-fatty acid 
methyl esters), for which the detection limit was found to be approximately 4 pg/µL of the 
various standards. These compounds in the negative controls might also be intermediates of 
β-oxidation, i.e. mostly the L-3-hydroxy-acyl-CoAs and perhaps even very low amounts of 
the D-form, which are normally present at very low concentrations in the plant material in 
which β-oxidation is taking place. A rough calculation indicated a total PHA content of 
0.03% (w/dry weight) in PHAC1#4.4 (multilocus plant), which related to approximately 5 
µg of PHA in a large fresh leaf weighing 155 mg. It was approximated that line 
PHAC1#3.3 produced 0.01% (weight/dry weight) in soil-grown plants. 
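
As a consistency check on the rough calculation above, the dry weight implied by the stated figures can be computed. This Python sketch is illustrative only; the function name `implied_dry_weight_mg` is hypothetical, and the dry matter fraction it yields is derived from the three stated numbers rather than reported in the specification.

```python
def implied_dry_weight_mg(pha_ug: float, pha_percent_of_dry_weight: float) -> float:
    """Dry weight (mg) implied by a PHA mass (µg) and a PHA content (% of dry weight)."""
    if pha_percent_of_dry_weight <= 0:
        raise ValueError("PHA content must be positive")
    pha_mg = pha_ug / 1000.0
    return pha_mg / (pha_percent_of_dry_weight / 100.0)


# Figures stated in the text: 5 µg PHA, 0.03% of dry weight, 155 mg fresh leaf.
dry_mg = implied_dry_weight_mg(5.0, 0.03)
dry_fraction = dry_mg / 155.0
print(round(dry_mg, 1), "mg dry weight;", round(100 * dry_fraction, 1), "% dry matter")
```

The stated values correspond to a leaf dry weight of roughly 17 mg, or about one ninth of the fresh weight.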

EXAMPLE 5: Screening for PHA synthase C2 expressing plants 

PHAC2 plants were screened directly for PHA production by analysis of dry leaves 
of T2 plants. Almost all of the T2 plants derived from 13 independently transformed plants 
were found to produce PHA in varying quantities, as judged by the presence of the novel 
peak due to the C8:1 monomer and also the peaks of the other PHA monomers. The highest 
producing plants were analyzed further and homozygous T3 plants were obtained. Two 
homozygous single-locus T3 lines were selected, PHAC2#19.5 and PHAC2#8.6. In 
comparison to PHAC1#3.3 plants, these PHAC2 plants produced slightly smaller quantities 
of PHA in seedlings grown on plates containing MS salts, kanamycin and sucrose. The 
monomer composition of the respective transgenic plants was, however, identical. For that 
reason most of the further studies were done only with line PHAC1#3.3. 

EXAMPLE 6: Immunolocalization and observation of PHA granules 

For the immunolocalization of the peroxisomally-targeted PHA synthase C1, T3 
seedlings of lines PHAC1#3.3 and pART27#21 (negative control) were grown on plates 
containing MS salts, kanamycin and sucrose. Seedlings were grown for 7 days under 
continuous light or in the dark after one day of illumination; the latter was done to obtain 
etiolated seedlings in which glyoxysomes are more abundant. The seedlings were fixed and 
sent together with some anti-PHA synthase C1 antiserum to Prof. Leech's laboratory at the 
University of York, where the immunolocalization was performed. It was found that the 
peroxisomes in PHAC1 seedlings were initially difficult to identify, since they did not look 
normal due to the presence of granules within them. These granules were very abundant in 
the etiolated seedlings, while in the light-grown seedlings most of the peroxisomes still 
looked normal or seemed to contain only tiny granules. The PHA synthase C1 was located 
in what seem to be two different types of organelles or peroxisomes, because one type 
contains a large quantity of PHA granules while the other apparently contains none. The 
darker peroxisomes without granules corresponded in appearance most closely to the normal 
peroxisomes in the negative controls. It is possible that this apparent heterogeneity is simply 
the result of a non-homogeneous distribution of granules within the peroxisomes. Glycolate 
oxidase was used as marker enzyme for peroxisomes of seedlings grown under light, while 
rubisco was used as chloroplastic marker. Antibodies against these two marker enzymes 
clearly identified the respective organelles in both PHAC1 seedlings and in the pART27 
negative controls. Glycolate oxidase was found to be located in the organelles, i.e. the 
peroxisomes, containing PHA granules. Similarly, the enzyme isocitrate lyase (ICL) was 
used as glyoxysomal marker in etiolated seedlings and it also confirmed that the 
granule-containing organelles were glyoxysomes. The antiserum against PHA synthase C1 
unambiguously identified the peroxisomal localization of the PHA synthase in the PHAC1 
seedlings, while it did not detect anything in the negative controls. Unusual accumulations 
of granules were also observed occasionally in the vacuoles of etiolated PHAC1 seedlings 
and these globules were gold-labelled with anti-PHA synthase C1. This was in 
accordance with the observation that the PHB synthase is found on the surface of PHB 
granules in bacteria (Gerngross, T. U. et al., J. Bacteriol. 175: 5289-5293 (1993)). 

EXAMPLE 7: Changing PHA yield and monomer composition in feeding studies 

Line PHAC1#3.3 was used to investigate whether the total yield of PHA could be 
increased or whether PHAs containing monomers other than those of the "native" PHA could be 
synthesized in PHAC1 transgenic plants. For that purpose seeds were sterilized and 
germinated in liquid medium containing mineral salts and 2% (w/v) sucrose supplemented 
with fatty acids or other compounds known to be degraded by β-oxidation. In experiment 
#1 the seedlings were grown for 3 days in the light before the substrates were added and the 
plants were moved into the dark. The material was harvested after 8 days and derivatized 
samples were analyzed by gas chromatography. 

The results summarized in Table 1 point out several encouraging aspects. The yield 
of native PHA (obtained without feeding any substrate) was doubled when seedlings were 
germinated in the dark as opposed to continuous illumination. This could perhaps be 
ascribed to a more complete mobilization of the seed lipids in etiolated seedlings. In this 
respect the regulation of the glyoxylate cycle enzymes malate synthase and isocitrate lyase 
might play a role by affecting lipid mobilization via β-oxidation. It has been shown that 
these glyoxylate cycle enzymes are regulated transcriptionally by three types of signal, 
namely light regulation, carbon catabolite repression by various sugars and developmental 
regulation during germination and senescence (Graham, I. A. et al., Plant Mol. Biol. 15: 
539-549 (1990); Graham, I. A. et al., Plant Cell 4: 349-357 (1992); Graham, I. A. et al., 
Plant Cell 6: 761-772 (1994)). 

The large increase in the PHA yield obtained by feeding TWEEN-20 (Sigma; 
50% palmitic acid (C16) esterified with polyoxyethylenesorbitol, the remainder being 
lauric acid (C12) and myristic acid (C14), also esterified) (TWEEN is a registered 
trademark of ICI Americas, Inc., Wilmington, DE) indicated that the PHA synthase was 
very active in these plants and thus not responsible for the relatively low yield of native 
PHA in seedlings grown without added fatty acids. The most pronounced effect of 
TWEEN-20 on the monomer composition was the decrease in the content of the H8:1 monomer 
from about 30% in native PHA to about 1%, which was most likely due to the lack of unsaturated 
fatty acid derivatives in the TWEEN-20. The relative distribution of the other monomers 
could be explained by the step-by-step β-oxidation of the C16, C14 and C12 components in 
TWEEN-20. A negative effect on seedling growth due to TWEEN-20 was observed, but it 
was small considering its high concentration (5% v/v) in the medium. 
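
The step-by-step β-oxidation argument above can be sketched as a simple enumeration. The Python sketch below assumes that each cycle shortens a saturated acyl chain by two carbons and that intermediates down to six carbons may be captured as 3-hydroxy monomers; the six-carbon lower bound and the function name `beta_oxidation_chain_lengths` are assumptions made for illustration.

```python
def beta_oxidation_chain_lengths(start_carbons: int, shortest: int = 6) -> list:
    """Chain lengths visited when a saturated acyl chain loses two carbons per cycle."""
    if start_carbons < shortest:
        return []
    return list(range(start_carbons, shortest - 1, -2))


# Acyl components of TWEEN-20 as described in the text: C16, C14 and C12.
TWEEN20_ACYL_CHAINS = (16, 14, 12)
possible_monomers = sorted(
    {n for chain in TWEEN20_ACYL_CHAINS for n in beta_oxidation_chain_lengths(chain)}
)
print(possible_monomers)
```

Because all three acyl components are even-numbered and saturated, the enumeration yields only even-numbered saturated monomers, consistent with the near absence of the H8:1 monomer after TWEEN-20 feeding.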

The accumulation of PHA granules in PHAC1 seedlings grown in liquid cultures 
supplemented with 5% TWEEN-20 under constant illumination for 12 days was very 
striking on electron micrographs. These PHA granules were not observed in the 
negative controls, i.e. pART27 transgenic seedlings fed with TWEEN-20. The granules 
looked different from the starch granules observed in chloroplasts. These electron 
microscopic studies were done in our own institute by Mrs J. Pététot and the results 
confirmed similar results obtained with etiolated seedlings in Prof. Leech's laboratory. 

TWEEN-60 (Sigma; 50% stearic acid (C18) and some palmitic and myristic acid; all 
esterified to polyoxyethylenesorbitol) and TWEEN-80 (Sigma; 50% oleic acid (C18:1), 
esterified to polyoxyethylenesorbitol) had less impact on the PHA yield, the monomer 
composition and the seedling growth than TWEEN-20. The relatively high level of the 
H8:1 monomer might be due to a higher contamination of TWEEN-60 and -80 with 
unsaturated fatty acids like α-linolenic acid, see above. 

The free fatty acids hexanoate and octanoate were fed at very low concentrations due 
to their toxic effects on plant growth. For hexanoate a large increase of the H6 monomer 
was observed, while octanoate resulted in a very high increase of the H8 monomer together 
with a moderate increase in the H6 monomer. For both substrates the H8:l monomer 
content remained relatively high, which was probably due to the normal accumulation of 
PHA from endogenous lipid β-oxidation ("native" PHA). 

Table 1. Increasing the total yield of PHA and changing its monomer composition in 
PHAC1 seedlings germinated in liquid media supplemented with fatty acids 

                                   17         18   3.2   44   20   15    14    4.0
Hexanoate (C6) c   0.05    dark    70 ± 3     11   30    32   21   7.0   7.4   3.1
Octanoate (C8) c   0.005   dark    125 ± 44   16   5.2   73   13   3.7   3.7   1.5


a The transesterified plant material (of specified weight) was in a volume of 1 mL 
chloroform, of which 1 µL was analyzed by GC. 
b An average of 30 seedlings were grown per sample. 
c Samples were done in duplicate and the results were averaged. 



In experiment #2 (Tables 2 and 3) the seedlings were germinated for 8 days under 
continuous illumination, then the growth medium was replaced by the same medium 
containing 5% (v/v) TWEEN-80 together with various fatty acids. The purpose of the 
TWEEN-80 was to solubilize the water-insoluble fatty acids. The samples were placed 
back under constant illumination for another 6 days before being harvested and analysed. 
All samples were done in duplicate and each sample contained approximately 45 seeds, 
which were germinated together in a large capped test-tube. Negative controls with 
pART27 plants were done for each substrate in identical fashion. None of the novel 
PHA-monomer peaks were found in these negative controls. 

Feeding of the saturated fatty acid tridecanoic acid (C13) and the branched fatty acid 
8-methyl-nonanoic acid (8M-C9) resulted in the incorporation of a whole range of novel 
monomers. The identity of all these novel monomers was established by GC-MS. All of 
them had an uneven number of carbon atoms in their acyl chains and could be directly 
traced to the original fatty acid supplied in the medium or intermediates of its degradation 
by β-oxidation. For tridecanoic acid, transgenic PHAC1 plants were found to contain a 
polymer having H13-, H11-, H9- and H7-3-hydroxy-alkanoic acid monomers. In the case 
of 8M-C9 the two novel monomers, 8-methyl-3-D-hydroxy-nonanoic acid (8M-H9) and 
6-methyl-3-D-hydroxy-heptanoic acid (6M-H7), retained the branched structure of the original 
substrate. This shows that the PHA synthase C1 was able to incorporate a large variety of 
monomers into the polymer, provided that intermediates having the proper conformation 
were generated. The descending order in terms of quantities of the novel monomers 
(H13>H11>H9>H7; and 8M-H9>6M-H7) suggests that the β-oxidation of these unusual 
fatty acids proceeds slowly, thus permitting more time for intermediate-sequestration by the 
PHA synthase. It is possible that the 3-hydroxy-acyl-CoA dehydrogenase (MFP) and some 
other enzymes of the β-oxidation cycle have a low substrate specificity for these fatty acids 
and their derived intermediates. 
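
The series of monomers reported above follows from the removal of two carbons per β-oxidation cycle. The following is a minimal sketch of that bookkeeping only, not a model of the enzymology. The function names and the lower cut-off of six carbons (`min_length`, chosen here because H6 is the shortest monomer reported in these examples) are assumptions made for illustration.

```python
def beta_oxidation_lengths(chain_length, min_length=6):
    """List the acyl chain lengths visited when two carbons are removed per cycle.

    min_length is an illustrative cut-off (assumption), not a measured limit.
    """
    lengths = []
    n = chain_length
    while n >= min_length:
        lengths.append(n)
        n -= 2  # each beta-oxidation cycle shortens the chain by two carbons
    return lengths


def straight_chain_monomers(chain_length, min_length=6):
    """Label the 3-hydroxy monomers expected from a saturated straight-chain acid."""
    return [f"H{n}" for n in beta_oxidation_lengths(chain_length, min_length)]


def methyl_branched_monomers(chain_length, methyl_position, min_length=6):
    """Label monomers from a methyl-branched acid; the branch position, counted
    from the carboxyl carbon, drops by two with every cycle."""
    labels = []
    for n in beta_oxidation_lengths(chain_length, min_length):
        shifted = methyl_position - (chain_length - n)
        labels.append(f"{shifted}M-H{n}")
    return labels


print(straight_chain_monomers(13))        # tridecanoic acid (C13)
print(methyl_branched_monomers(9, 8))     # 8-methyl-nonanoic acid (8M-C9)
```

Under these assumptions the sketch lists H13, H11, H9 and H7 for tridecanoic acid and 8M-H9 and 6M-H7 for 8-methyl-nonanoic acid, the same sets identified by GC-MS.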

Feeding of petroselinic acid (C18:1, 6-cis) resulted in a large increase in the content 
of the H14 monomer. This observation was in agreement with the proposed scheme of its 
degradation by β-oxidation (Gerhardt, B., Lipid metabolism in plants (Moore, T. S., Jr., ed.), 
CRC Press Inc., pp. 527-565 (1993)). All unsaturated bonds in the cis-conformation 
starting at an even-numbered carbon in the acyl chain were proposed to present obstacles to 
the normal cycle of β-oxidation and had to be circumvented by modifications of the 
pathway. This is because the D-3-hydroxy-acyl-CoA can be formed by the action of the 
enoyl-CoA hydratase (MFP) from 2-cis-enoyl-CoA (cis-unsaturated bond in even-numbered 
position), but the D-3-hydroxy-acyl-CoA cannot be utilized by the 3-hydroxy-acyl-CoA 
dehydrogenase (MFP), which can only act on the L-3-hydroxy-acyl-CoA. Three possible 
modifications were put forward: 1) An epimerase converts the D-3-hydroxy-acyl-CoA to 
the L-form. 2) A dehydratase (also called D-3-hydroxyacyl-CoA hydro-lyase or D-specific 
2-trans-enoyl-CoA hydratase II, see Engeland, K. and Kindl, H., Eur. J. Biochem. 200: 
171-178 (1991)) converts the D-3-hydroxy-acyl-CoA to 2-trans-enoyl-CoA, which can then be 
reconverted to L-3-hydroxy-acyl-CoA by the enoyl-CoA hydratase I. 3) A 2,4-dienoyl-CoA 
reductase reduces the 2-trans-4-cis-acyl-CoA β-oxidation intermediate to the 
3-cis-enoyl-CoA, which in turn will require the activity of an isomerase to form the 
2-trans-enoyl-CoA β-oxidation intermediate. The first two options would result in the generation of 
D-3-hydroxy-acyl-CoA intermediates which would be directly available to the PHA synthase. 
Thus the observation of the specific increase in the H14 monomer upon feeding with 
petroselinic acid fits well with the predicted modifications of β-oxidation to bypass the 
cis-unsaturated bond at carbon 6 of petroselinic acid. The same modifications have also 
been used above to explain the presence of the 3-hydroxy-octenoyl monomer (H8:1) in the 
native PHA. It was speculated that this monomer was due to the degradation of 18:3, 
all-cis-Δ9,12,15 and 16:3, all-cis-Δ7,10,13 by β-oxidation. The high proportion of H8 
monomer could similarly be due to the degradation of linoleic acid (18:2, all-cis-Δ9,12), 
which is an abundant fatty acid in plant material. 
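
The chain lengths named in this paragraph can be checked with a small positional calculation. The sketch below is a deliberate simplification and rests on stated assumptions: it considers only cis double bonds at even-numbered carbons, assumes each such bond yields a D-3-hydroxy intermediate once β-oxidation has brought it to carbon 2, and counts only the double bonds lying further from the carboxyl end as remaining unsaturations. It ignores how odd-positioned bonds are handled and the reductase route (option 3). The function name is hypothetical.

```python
def d3_hydroxy_monomers(chain_length, cis_double_bonds):
    """Predict monomer labels arising from cis bonds at even-numbered carbons.

    chain_length: number of carbons in the fatty acid.
    cis_double_bonds: positions (counted from the carboxyl carbon) of cis bonds.
    """
    monomers = []
    for position in sorted(cis_double_bonds):
        if position < 2 or position % 2 != 0:
            continue  # only even-positioned cis bonds are considered here
        cycles = (position - 2) // 2          # cycles needed to bring the bond to C2
        length = chain_length - 2 * cycles    # two carbons are lost per cycle
        remaining = sum(1 for p in cis_double_bonds if p > position)
        label = f"H{length}" if remaining == 0 else f"H{length}:{remaining}"
        monomers.append(label)
    return monomers


print(d3_hydroxy_monomers(18, [6]))          # petroselinic acid
print(d3_hydroxy_monomers(18, [9, 12, 15]))  # 18:3, all-cis-9,12,15
print(d3_hydroxy_monomers(16, [7, 10, 13]))  # 16:3, all-cis-7,10,13
print(d3_hydroxy_monomers(18, [9, 12]))      # linoleic acid
```

With these inputs the sketch gives H14 for petroselinic acid, H8:1 for both trienoic acids and H8 for linoleic acid, matching the assignments proposed in the text.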

The degradation of fatty acids containing hydroxy groups on even-numbered carbon 
atoms in either the D- or the L-conformation also poses obstacles to the normal β-oxidation 
pathway and modifications are required to bypass these (Gerhardt, B., Lipid metabolism in 
plants (Moore, T. S., Jr., ed.), CRC Press Inc., pp. 527-565 (1993)). The 
D-4-hydroxy-decanoate-CoA and D-2-hydroxy-octanoate-CoA intermediates were predicted to arise in 
the degradation of ricinoleic acid (D-12-hydroxy-oleic acid (9-cis)). To investigate whether 
these intermediates might be incorporated into the PHA polymer by the PHA synthase, 
ricinoleic acid was used to supplement the medium in which PHAC1 plants were 
germinating. No major peaks due to the incorporation of novel monomers into the PHA 
polymer were detected, but GC-MS analysis was utilized to search for specific predicted 
novel monomers by looking for characteristic fragmentation products, namely ions 117 and 
89. A small peak was found with ion 117; this peak showed the fragmentation fingerprint of 
the D-4-hydroxy-decanoate-methyl ester and was absent in the corresponding negative 
control. No novel peak was found with ion 89, thus excluding the possibility that the 
D-2-hydroxy-octanoate was incorporated into the polymer. It is known that the PHA synthase 
can incorporate D-4-hydroxy- and D-5-hydroxy monomers into PHA in bacterial cultures; 
therefore the incorporation of the D-4-hydroxy-decanoate in the germinating seeds fed with 
ricinoleic acid was plausible. The very low abundance of the monomer could perhaps be 
explained by an alternative and more efficient pathway for the degradation of ricinoleate 
(Gerhardt, B., Lipid metabolism in plants (Moore, T. S., Jr., ed.), CRC Press Inc., pp. 
527-565 (1993)). 



Table 2. Quantity of PHA production in PHAC1 seedlings germinated in liquid medium 
supplemented with fatty acids 

458 ± 8 
5.0 

Tridecanoic acid (C13) + T                  0.1   276 ± 9    28 
8-methyl-nonanoic acid (8M-C9) + T          0.1   48 ± 14    46 
Petroselinic acid (C18:1, 6-cis) + T        1     287        9.4 
Ricinoleic acid (D12-OH-C18:1, 9-cis) + T   0.1   215 ± 21   6.0 



a The plant material (of specified weight) was transesterified in different volumes, but the 
integrated peak areas were normalized to the same sample volume (1 mL chloroform, 
of which 1 µL was analyzed by GC). 

a 8M-H9 and 6M-H7 refer to 8-methyl-3-D-hydroxy-nonanoic acid and 
6-methyl-3-D-hydroxy-heptanoic acid, respectively. 
b 4-OH-H10 refers to D-4-hydroxy-decanoate. 
c The quantity of 4-OH-H10 was estimated by comparing peak sizes with H6 on a GC-MS 
chromatogram. 

EXAMPLE 8: Extraction of high molecular weight PHA 

The presence of derivatized monomers of PHA in PHAC1 plants had been 
established by the GC-analysis of trans-esterified intact plant material. To prove that the 
PHA was synthesized as high-molecular weight polymer and for its physico-chemical 
characterization, the purification of large quantities (i.e. in the mg range) was undertaken. 
Seeds of PHAC1#3.3 were germinated in liquid medium with and without addition of 
TWEEN-20 in order to obtain TWEEN-20-derived PHA or unmodified PHA, respectively. 

For the TWEEN-20-derived PHA, approximately 16000 seeds (313 mg dry seeds) 
were germinated in 900 mL 1/2x MS + 1% sucrose medium for 7 days under continuous 
illumination on a shaker; the medium was then replaced with 1/2x MS + 2% sucrose containing 
5% TWEEN-20 and the seedlings were grown for another 9 days in the light. The plant 
material was harvested, washed extensively with water to remove residual TWEEN-20, 
frozen and lyophilized. The dry material was ground with a mortar and pestle, weighed, and 
lipids were extracted by a six-hour Soxhlet-extraction with methanol. The 
methanol-insoluble PHA was extracted for 24 hours in the same manner with chloroform. The 
chloroform extract was concentrated under reduced pressure and the PHA was precipitated 
by the addition of 10 volumes of cold methanol. This methanol precipitation was performed 
twice to ensure a high purity of the PHA. 27 mg of PHA was thus obtained from 5.35 g 
lyophilized and powdered seedling material, which corresponded to 0.50% weight/dry weight. 
The PHA was trans-esterified and analyzed by GC. It was found that 58% of the PHA 
present in the methanol-extracted plant powder was extracted by the chloroform. It has 
been established in previous experiments that the remaining PHA was recalcitrant to 
extraction. The chromatogram showed that the extracted PHA was adequately pure, with the 
peaks of the six identified monomers constituting 93% of the total integrated area. The ratio 
of the integrated areas between the different monomers was very similar to the result shown 
in Table 1 for the sample containing TWEEN-20 and grown under light, see Table 4. 
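
The weight/dry weight figures quoted in this example follow directly from the recovered PHA mass and the mass of dry plant material. A minimal sketch of that arithmetic is given here; the function name is chosen for illustration only, and the second pair of values is taken from the native PHA preparation described in the next paragraph.

```python
def percent_dry_weight(pha_mg, dry_material_g):
    """Return PHA recovered as a percentage of the dry plant material (w/w)."""
    dry_material_mg = dry_material_g * 1000.0
    return 100.0 * pha_mg / dry_material_mg


# TWEEN-20-derived PHA: 27 mg from 5.35 g lyophilized seedling material
tween_yield = percent_dry_weight(27, 5.35)

# Native PHA: 23 mg from 14.3 g dry plant material
native_yield = percent_dry_weight(23, 14.3)

print(f"TWEEN-20-derived PHA: {tween_yield:.2f}% w/dw")
print(f"Native PHA: {native_yield:.2f}% w/dw")
```

Rounded to two decimal places, the two results agree with the 0.50% and 0.16% stated in the text.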

For the extraction of high-Mr PHA produced by PHAC1 plants without additional 
fatty acid supplements (native PHA), 1076 mg seeds (approx. 54000 seeds) were 
germinated in 3.3 L liquid medium (1/2x MS, 2% sucrose). The seeds were germinated 
under continuous illumination for 6 days; thereafter the medium was replaced and the 
seedlings were put into the dark for another 7 days in order to induce plant senescence. The PHA 
was extracted from the plant material as above and one methanol precipitation was 
performed to purify the PHA. 23 mg of PHA was obtained from 14.3 g dry plant material, 
which corresponded to 0.16% weight/dry weight. It was determined that approximately 69% of 
the PHA had remained in the plant material after the chloroform extraction, which could be due to either 
the high content of C8:1 monomer (see Table 5) causing the polymer to "stick", or due to 
moisture in the ground material which had not been lyophilized completely, or due to the 
large sample size, for which a longer and more efficient chloroform extraction might have 
been required. The purification of native PHA and analysis by GC-MS allowed the 
detection of several more peaks that could not be initially resolved in crude extracts because 
of the high level of noise in the chromatogram. A total of eighteen 3-hydroxyacid 
monomers could be detected in the polymer (Table 1). In addition to 3-hydroxyhexanoic acid 
(H:6), 3-hydroxyoctanoic acid (H:8), 3-hydroxydecanoic acid (H:10), 3-hydroxydodecanoic 
acid (H:12), 3-hydroxytetradecanoic acid (H:14) and 3-hydroxyoctenoic acid (H8:1) 
monomers previously detected in the transesterification of intact plant material (crude 
extract) (Table 1), novel saturated and unsaturated monomers were detected which include 
3-hydroxyhexadecanoic acid (H:16), 3-hydroxynonanoic acid (H9), 3-hydroxyundecanoic 
acid (H:11), 3-hydroxytridecanoic acid (H:13), 3-hydroxyhexadecatrienoic acid (H16:3), 
3-hydroxyhexadecadienoic acid (H16:2), 3-hydroxyhexadecenoic acid (H16:1), 
3-hydroxytetradecatrienoic acid (H14:3), 3-hydroxytetradecadienoic acid (H14:2), 
3-hydroxytetradecenoic acid (H14:1), 3-hydroxydodecadienoic acid (H12:2) and 
3-hydroxydodecenoic acid (H12:1). All even-chained monomers could be quantified and 
results are shown in Table 5. 

It is expected that many of the unidentified minor peaks detected in the PHA 
purified from the TWEEN-20-fed seedlings would correspond to some of the minor 
saturated and unsaturated monomers detected in the "native" PHA. 

Table 4. Comparison of the monomer composition of purified high-molecular weight PHA 
from TWEEN-20-fed plants with results obtained for transesterified intact seedlings during 
the preliminary feeding studies 

a Integrated area on the chromatogram. 

Table 5. Monomer composition of "native" PHA isolated from phaC1-transformed plant 
line 3.3 grown in liquid media a 

                                                        H16:3 
% (w/w)      4.2    6.7    7.5    11     2.0    2.0     5.6 
std. dev.    1.1    2.3    1.4    3.2    0.26   0.41    1.3 

a Quantification of methyl esters was performed with a GC with a FID detector. Values 
were obtained from four separate PHA preparations. Monomers present in trace amounts 
(H9, H:11, H:13, H16:1) were not quantified. 



EXAMPLE 9: Chemical characterization of high-molecular weight plant PHA 

Purified TWEEN-20-derived PHA (13 mg) and unmodified PHA (5 mg) were given 
to Géraldine Coullerez at the EPFL (collaboration IBPV-EPFL) for the physico-chemical 
characterization of the polymer. Two different samples of bacterial PHA, PHA1 and 
PHOE, were obtained from Witholt and Kellerhals (ETH Zürich) to be used as controls. 
PHA1 contained predominantly H6 and H8 monomers (10% and 90%, respectively), while 
PHOE contained 4-10% H8:1, the balance being H6 and H8. The molecular weights and 
the respective dispersion coefficients of the polymers were determined by gel permeation 
chromatography (see Table 6). Polystyrene polymers were used as molecular weight 
standards. The results clearly show that the TWEEN-20 derived PHA produced by the 
transgenic plants is in the form of a high-Mr polymer (about 200-250 monomers), although 
the molecular weight is only 20-25% of the bacterial polymers (about 1000 monomers). 
This shorter polymer length can be explained by an overabundance of PHA synthase 
relative to its substrate concentration and similar results have also been obtained in in vitro 
polymerization assays with purified PHB synthase (Jun Sim, S. et al., Nature Biotechnology 
15: 63-67 (1997)). It is also possible that PHA polymers with longer chain lengths are 
trapped in the plant material, since a significant proportion of the PHA seems to be 
recalcitrant to chloroform extraction (> 50%, difficult to determine exact amounts in the 
trans-esterification of intact or powdered plant material, see above). 
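
The dispersion coefficients reported in Table 6 can be reproduced as a ratio of two average molecular weights. The sketch below assumes that the two numeric columns of Table 6 are the weight-average (Mw) and number-average (Mn) molecular weights, which is the usual output of gel permeation chromatography; the column labels themselves are not legible in the source, so this reading is an assumption, as are the variable and function names.

```python
# (Mw, Mn) pairs as read from Table 6; the Mw/Mn assignment is an assumption.
samples = {
    "TWEEN-20 derived plant PHA": (4.01e4, 4910),
    "PHA1 (P. oleovorans)": (1.46e5, 58850),
    "PHOE (P. oleovorans)": (2.0e5, 81590),
}


def dispersity(mw, mn):
    """Dispersion coefficient, taken here as Mw divided by Mn."""
    return mw / mn


for name, (mw, mn) in samples.items():
    print(f"{name}: Mw/Mn = {dispersity(mw, mn):.2f}")

# Plant polymer size relative to each bacterial control, as a percentage of Mw
plant_mw = samples["TWEEN-20 derived plant PHA"][0]
for name in ("PHA1 (P. oleovorans)", "PHOE (P. oleovorans)"):
    print(f"plant Mw relative to {name}: {100 * plant_mw / samples[name][0]:.0f}%")
```

The computed ratios land close to the tabulated dispersion values, which supports the assumed column assignment; small differences are expected because the tabulated molecular weights are rounded.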

NMR analysis of the plant and bacterial PHAs revealed that the TWEEN-20 derived 
plant PHA had the same structure as the bacterial PHA. The NMR spectrum of the 
unmodified plant PHA showed the peaks characteristic for the PHA polymer backbone, as 
well as several other peaks which have not been properly assigned or identified at this stage, 
but which could be due to various unsaturated bonds in the side chains of the polymer. 

Table 6. Comparison of molecular weights of high-Mr PHAmcl purified from plants and 
bacteria 

Sample                                          Mw            Mn      Dispersion 
TWEEN-20 derived PHA - Arabidopsis PHAC1#3.3    4.01 x 10^4   4910    8.17 
PHA1 from Pseudomonas oleovorans                1.46 x 10^5   58850   2.48 
PHOE from P. oleovorans                         2.0 x 10^5    81590   2.44 

EXAMPLE 10: The multifunctional protein (MFP) from the yeast Candida tropicalis 



In animals, plants and bacteria, β-oxidation has been shown to proceed via the 
L-isomer of the 3-hydroxy-acyl-CoA intermediates, and any D-isomers (which are predicted to 
arise in the degradation of fatty acids containing cis-unsaturated bonds at even-numbered 
carbons) have to be converted to the L-form in order to be oxidized further by the 
dehydrogenase activity of the multifunctional protein (MFP). In yeast, β-oxidation was 
reported to proceed via the D-isomer (Nuttley, W. M. et al., Gene 69: 171-180 (1988); 
Hiltunen, J. K. et al., J. Biol. Chem. 267: 6646-6653 (1992); Fossi, A. et al., Mol. Gen. 
Genet. 247: 95-104 (1995)). The yeast multifunctional protein (MFP) was shown to contain 
enoyl-CoA hydratase II and D-3-hydroxyacyl-CoA dehydrogenase activities, which together 
converted trans-2-enoyl-CoA via D-3-hydroxyacyl-CoA to 3-ketoacyl-CoA, i.e. the 
D-isomer was directly utilized by the dehydrogenase without prior conversion to the L-form. 
It is anticipated that expression of this hydratase II activity together with the PHA synthase 
in the peroxisomes of double-transgenic plants will generate more of the 
D-3-hydroxy-acyl-CoA intermediates for their incorporation by the PHA synthase into the PHA polymer, thus 
increasing the final yield of PHA. Four separate approaches are envisioned. 

A. Expression of the unchanged MFP from C. tropicalis in A. thaliana. 

Since the hydratase II activity forms part of the MFP it was decided to perform 
investigatory experiments with the complete MFP prior to attempting to abolish the D-3- 
hydroxyacyl-CoA dehydrogenase activity. As the fungal MFP already had a peroxisomal 
targeting signal, this protein was expected also to be targeted to the plant peroxisomes. 

The C. tropicalis MFP cDNA (Nuttley, W. M. et al., Gene 69: 171-180 (1988), 
GenBank Accession Number M22765) was cloned via PCR amplification (SEQ ID NO:21, 
encoding SEQ ID NO:22) into pART7 to obtain pART7_MFP. The NotI-cassette, 
containing the CaMV35S promoter in front of the MFP gene and the ocs3'-terminator, was 
inserted into the plant binary vector pART27 to obtain pART27_MFP, which was 
transformed into Arabidopsis. Transgenic plants were selected on kanamycin and screened 
for the expression of the MFP protein with an anti-MFP antiserum. Homozygous T2 plants 
were cross-fertilized with PHAC1#3, PHAC1#4 and PHAC1#9 plants. Offspring from 
these crosses will be analyzed for their ability to biosynthesize PHA. 

B. Changing the peroxisomal targeting signal of the yeast multifunctional protein 
(MFP) from -AKI to -SKL. 

The COOH-terminal tripeptide -AKI was shown to be responsible for peroxisomal 
targeting of the MFP in yeast, but it has not yet been demonstrated to function in plant 
peroxisomal targeting. The MFP.SKL gene, in which the 3'-terminal nucleotide sequence of 
the MFP gene encoding the -AKI tripeptide had been changed to -SKL by PCR site-directed 
mutagenesis (SEQ ID NO:23, encoding SEQ ID NO:24), was obtained from the laboratory 
of K. Hiltunen to ascertain that the MFP was properly targeted to the plant peroxisomes and 
to serve as a positive control in targeting studies with the yeast multifunctional protein 
(MFP) in plant cells. The MFP.SKL gene was used to construct pART7_MFP.SKL. The 
NotI-cassette of pART7_MFP.SKL, containing the MFP.SKL gene flanked by the 
CaMV35S promoter and the ocs3'-terminator, was cloned into pART27 to obtain 
pART27_MFP.SKL, which was transformed into A. thaliana ecotype Columbia. 
Kanamycin resistant T1 plants were obtained. The high-MFP.SKL-expressing lines will be 
selected by Western analysis of T2 plants, and the selected lines will be crossed with 
PHAC1#3.3 plants. 

C. Deleting the peroxisomal targeting signal of the yeast multifunctional protein 
(MFP). 

The construct pART7_MFPΔAKI was obtained by PCR amplification of the MFP 
gene such that the 3'-terminal nucleotide sequence of the MFP gene encoding the -AKI 
tripeptide was deleted by the introduction of a stop codon (SEQ ID NO:25, encoding SEQ 
ID NO:26). The "detargeted" MFPΔAKI is expected to be localized in the cytoplasm and 
will be utilized as negative control in experiments to study the localization of MFP and 
MFP.SKL in plant cells. pART27_MFPΔAKI was transformed into A. thaliana ecotype 
Columbia and kanamycin resistant T1 plants were obtained. The high-MFPΔAKI-expressing 
lines will be selected by Western analysis of T2 plants and these lines will be 
crossed with PHAC1#3.3 plants. 

D. Deleting the dehydrogenase activity of the yeast multifunctional protein (MFP). 

As only the hydratase II activity of the yeast multifunctional protein (MFP) is of 
interest, plants will be transformed with the MFPΔDH gene, in which the dehydrogenase 
activity was deleted by site-directed mutagenesis of specific amino acid residues identified 
as being essential for this activity. 

EXAMPLE 11: Verification of enzyme activity of modified MFP constructs in Pichia 

The modified MFP.SKL and MFPΔAKI genes were subcloned from 
pART7_MFP.SKL and pART7_MFPΔAKI into the yeast expression vector pHILD2. The 
resulting plasmids pHILD2_MFP.SKL and pHILD2_MFPΔAKI were transformed into 
Pichia and enzyme assays were performed in Hiltunen's laboratory. Results indicated that 
the modifications to the genes did not have an effect on the dehydrogenase and the 
hydratase enzymatic activities. 

EXAMPLE 12: Expression of the FatB3 acyl-ACP thioesterase in double 
transgenics to increase PHA yield 

Expression of the California bay acyl-ACP thioesterase was shown to cause 
premature termination of fatty acid elongation during fatty acid biosynthesis in transgenic 
oilseed plants (Voelker, T. A. et al., Science 257: 72-74 (1992)). The resulting 
medium-chain-length fatty acids were found to accumulate in the triglycerides of seed lipids, but 
could not be detected in leaves. It is thought that medium chain fatty acids do not 
accumulate in the leaves of transgenic plants because they get degraded immediately by 
β-oxidation (Eccleston, V. S. et al., Planta 198: 46-53 (1996)). This increased flux of 
medium-chain fatty acids through β-oxidation may be exploited to improve the yield of 
PHA, as well as to modify the composition of the polymer towards saturated H6-H14 
monomers in double transgenic plants expressing both acyl-ACP thioesterase and the 
PHAC1 synthase. 

The plasmid pBJ49_FatB3 containing the Cuphea lanceolata thioesterase FatB3 gene 
under control of a 200 bp minimal promoter derived from the 35S promoter was infiltrated 
into the A. thaliana PHAC1#3.3 transgenic line, which is homozygous for the PHAC1 gene. 
Hygromycin resistant lines were obtained and the seed lipid content of T1 seeds was 
analysed for increased levels of medium chain length fatty acids. Eleven separate lines 
expressing high levels of the acyl-ACP thioesterase were identified in this manner. 
Subsequently the polyhydroxyalkanoate content of leaves from soil grown T2 double 
transgenic offspring was determined by GC and GC-MS analysis of the 3-hydroxy-fatty 
acid methyl esters obtained by transesterification of whole leaves. The results (Table 7) 
indicated an approximate tenfold increase in the polyhydroxyalkanoate content of leaves 
from double transgenic plants when compared to plants expressing only the PHAC1 
synthase. The increased polyhydroxyalkanoate yield was mainly due to a large increase in 
the content of the saturated polyhydroxyalkanoate monomers with an even number of 
carbons, namely 3-OH-octanoate (H8), 3-OH-decanoate (H10), 3-OH-dodecanoate (H12) 
and 3-OH-tetradecanoate (H14) (Table 8). 
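
The size of the increase can be recomputed from the individual plant values listed in Table 7. The sketch below assumes that the summary figures in that table are the arithmetic mean and the sample standard deviation of the per-plant values; the source does not state this, so the reading is an inference, and the variable names are illustrative.

```python
from statistics import mean, stdev

# Per-plant PHA content values from Table 7
single_transgenic = [0.0040, 0.0253]            # PHAC1#3.3 plants 1 and 2
double_transgenic = [0.1281, 0.0749, 0.1495]    # PHAC1#3.3 + FatB3 plants

single_mean = mean(single_transgenic)
double_mean = mean(double_transgenic)

# stdev() is the sample standard deviation (n - 1 in the denominator)
single_sd = stdev(single_transgenic)
double_sd = stdev(double_transgenic)

fold_increase = double_mean / single_mean

print(f"single transgenic: mean {single_mean:.4f}, s.d. {single_sd:.3f}")
print(f"double transgenic: mean {double_mean:.4f}, s.d. {double_sd:.3f}")
print(f"fold increase: {fold_increase:.1f}")
```

On this reading the recomputed means and standard deviations agree with the tabulated summary values to the precision shown there, and the ratio of the two means is roughly eightfold, of the same order as the "approximate tenfold" increase described in the text.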

The recombinant FatB3 acyl-ACP thioesterase is naturally targeted to the 
chloroplast, where it removes medium chain-length acyl-ACP intermediates from fatty 
acid biosynthesis. These short chain fatty acids accumulate in the seed lipids, but not in the 
leaves of transgenic plants, and it has been speculated that they are immediately degraded 
by β-oxidation. Results with these double transgenic plants indicate that there is indeed an 
increase in the β-oxidation of medium chain length fatty acids in the leaves, which results in 
a higher yield of polyhydroxyalkanoate due to the incorporation of the β-oxidation 
intermediates into the PHA by the polyhydroxyalkanoate synthase. 



Table 7. PHA content of leaves from single and double transgenic plants expressing the 
PHAC1 synthase alone or together with the FatB3 acyl-ACP thioesterase 

Plant                                  PHA content   Mean     Std. dev. 
PHAC1#3.3 plant 1                      0.0040 
PHAC1#3.3 plant 2                      0.0253        0.0147   0.015 
PHAC1#3.3 + FatB3 line 2.4a plant 2    0.1281 
PHAC1#3.3 + FatB3 line 2.4b plant 1    0.0749        0.1175   0.038 
PHAC1#3.3 + FatB3 line 2.4b plant 5    0.1495 

Table 8. PHA content of leaves from single and double transgenic plants expressing the 
PHAC1 synthase alone or together with the FatB3 acyl-ACP thioesterase 

           PHAC1#3.3                        PHAC1#3.3 + FatB3 
Monomer    Mean      Std. dev.   %          Mean      Std. dev.   % 
H6         0.00035   U.UUUJO     2.39       0.00455   0.00131     3.87 
H8:1       0.00451   0.00525     30.76      0.00790   0.00913     6.71 
H8         0.00205   0.00201     13.94      0.03765   0.01120     32.05 
H9         0.00014   0.00001     0.95       0.00029   0.00010     0.25 
H10        0.00087   0.00080     5.91       0.04694   0.01816     39.96 
H11        0.00017   0.00002     1.17       0.00034   0.00015     0.29 
H12        0.00145   0.00145     9.87       0.00642   0.00247     5.47 
H13        0.00016   0.00010     1.09       0.00023   0.00013     0.20 
H14:1      0.00072   0.00059     4.92       0.00141   0.00114     1.20 
H14:2      0.00121   0.00142     8.21       0.00179   0.00209     1.53 
H14:3      0.00086   0.00106     5.86       0.00142   0.00178     1.20 
H14        0.00219   0.00222     14.93      0.00853   0.00459     7.26 




EXAMPLE 13: Crossing PHAC1#3.3 transgenic plants with fatty acyl hydroxylase 
LFah12 transgenic plants 

Three lines of transgenic A. thaliana expressing the LFah12 fatty acyl hydroxylase 
gene from Lesquerella were obtained from Pierre Broun (Chris Somerville's laboratory, 
Carnegie Institution, Stanford, CA). This fatty acyl hydroxylase is responsible for the 
production of ricinoleic acid (C18:1; 9-cis, D-12-hydroxy) in Lesquerella. It was found that 
hydroxylated fatty acids accumulated in the seed triglycerides of Arabidopsis, but not in the 
leaves, again indicating that hydroxylated fatty acids synthesized in leaves are most likely 
degraded by β-oxidation (Broun, P. and Somerville, C., Plant Physiol. 113: 933-942 (1997); 
van de Loo, F.N. et al., Proc. Natl. Acad. Sci. U.S.A. 92: 6743-6747 (1995)). Crosses were 
made with the three fatty acyl hydroxylase transgenic lines and the PHAC1#3.3 line and the 
seeds of these crosses were harvested. Seeds and their progeny plants will be examined for 
their levels of PHA biosynthesis. The aim of this experiment is to investigate if the 
increased flux of hydroxylated fatty acids to the β-oxidation cycle in transgenic plants 
expressing the Fah12 and PHA synthase genes can lead to an increase in the yield of PHA 
and if novel hydroxylated monomers can be incorporated in the PHA. 

EXAMPLE 14: Influence of carbon source and light conditions on PHA synthesis 



The amount of PHA present in plant tissues was influenced by the growth conditions. 
For plants grown for three weeks under constant illumination in MS liquid media with 2% 
sucrose, the yield of PHA was approximately 0.6 mg/g dry weight (dwt). Removal of 
sucrose for the last week of growth in the light resulted in a 100% increase in PHA, while 
plants growing in 2% sucrose but shifted to the dark for the last week accumulated 22% 
more PHA (Table 9). 

Table 9. Influence of sucrose and light on PHA accumulation in phaC1-transformed line 3.3 a 

                0% sucrose   0.2% sucrose   2% sucrose   0% sucrose   0.2% sucrose   2% sucrose 
                dark         dark           dark         light        light          light 
mg PHA/g dwt    1.42         1.31           0.73         1.23         1.08           0.60 
Relative % b    100          92             52           87           76             42 

a Seedlings were grown under constant illumination in a liquid medium containing MS salts 
and 2% (w/v) sucrose for 2 weeks, and then grown for another week, either in the dark or in 
the light, in media containing different concentrations of sucrose. 
b The yield of 1.42 mg/g dry weight was arbitrarily defined as 100%. 
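
The relative percentages in Table 9 and the increases quoted in the text can be recomputed from the tabulated yields. The following is a minimal sketch of that arithmetic; the dictionary layout and function names are illustrative assumptions. Because the tabulated yields are themselves rounded, a recomputed percentage may differ from the table by one point (for example 51 rather than 52 for 2% sucrose in the dark).

```python
# mg PHA per g dry weight from Table 9, keyed by (sucrose in final week, light condition)
yields = {
    ("0%", "dark"): 1.42,
    ("0.2%", "dark"): 1.31,
    ("2%", "dark"): 0.73,
    ("0%", "light"): 1.23,
    ("0.2%", "light"): 1.08,
    ("2%", "light"): 0.60,
}

REFERENCE = yields[("0%", "dark")]  # defined as 100% in footnote b


def relative_percent(value, reference=REFERENCE):
    """Yield expressed as a percentage of the reference yield."""
    return 100.0 * value / reference


def percent_increase(value, baseline):
    """Increase of value over baseline, in percent of the baseline."""
    return 100.0 * (value - baseline) / baseline


for condition, value in yields.items():
    print(condition, f"{relative_percent(value):.0f}%")

baseline = yields[("2%", "light")]
print(f"no sucrose, light: +{percent_increase(yields[('0%', 'light')], baseline):.0f}%")
print(f"2% sucrose, dark:  +{percent_increase(yields[('2%', 'dark')], baseline):.0f}%")
```

Relative to the 2% sucrose, constant light condition, the recomputed increases are about 105% for removal of sucrose and about 22% for the shift to darkness, consistent with the figures given in the text.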



EXAMPLE 15: Peroxisome targeting 



It has been shown in multiple systems (e.g., yeast, animal, and plants) that targeting 
of proteins to the peroxisome can be achieved by the addition of as little as three amino 
acids at the carboxy end of a foreign protein (see Gietl, C., Physiol. Plant. 97: 599-608 
(1996); Purdue, P.E. and Lazarow, P. B., J. Biol. Chem. 269: 30065-30068 (1994); 
Subramani, Annu. Rev. Cell Biol. 9: 445-478 (1993)). The minimal consensus sequence for 
peroxisome targeting of protein via the carboxy end, named PTS1 for peroxisomal targeting 
sequence 1, is a small uncharged amino acid at position 1 (S, A, or P), a positively-charged 
amino acid at position 2 (K, R, S, or H), and a hydrophobic amino acid at position 3 (L, M, 
I or F). 

Thus, although the initial minimal PTS1 sequence was defined as SKL, a range of 
substitutions have been found to be effective PTS1 signals, including ARM, SRM, SKL, ARL, 
SRL, PSI, or PRM. Specific examples of targeting of foreign proteins in plants include: 6 
amino acid PTS1 (RAVARL, Volokita, M., Plant J. 1: 361-366 (1991)); 5 amino acid PTS1 
(AKSRM, Olsen, L. J. et al., Plant Cell 5: 941-952 (1993)); 4 amino acid PTS1 (KSRM, 
Trelease, R. N. et al., Protoplasma 195: 156-167 (1996)); 5 amino acid PTS1 (ELSRL, 
Hayashi, M. et al., Plant J. 10: 225-234 (1996)); 4 amino acid PTS1 (RPSI, Mullen, R. T. et 
al., Plant J. 12: 313-322 (1997)); 3 amino acid PTS1 (SKL, Banjoko, A. et al., Plant Physiol. 
107: 1201-1208 (1995)); 3 amino acid PTS1 (ARM, Lee, M.S. et al., Plant Cell 9: 185-197 
(1997)). 
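
The three-position consensus described above can be written as a simple pattern on the last three residues of a protein. The sketch below is an illustration only: it encodes the residue sets exactly as listed in the text and says nothing about whether a matching sequence is actually functional in a given species. The names `PTS1_CONSENSUS` and `has_pts1_consensus` are hypothetical.

```python
import re

# Position 1: S, A or P; position 2: K, R, S or H; position 3: L, M, I or F.
# The pattern is anchored to the carboxy terminus of the sequence.
PTS1_CONSENSUS = re.compile(r"[SAP][KRSH][LMIF]$")


def has_pts1_consensus(protein_sequence):
    """Return True if the C-terminal tripeptide fits the consensus in the text."""
    return bool(PTS1_CONSENSUS.search(protein_sequence.upper()))


# Tripeptides and longer C-terminal signals named in this example
reported = ["ARM", "SRM", "SKL", "ARL", "SRL", "PSI", "PRM",
            "RAVARL", "AKSRM", "KSRM", "ELSRL", "RPSI"]

for signal in reported:
    print(signal, has_pts1_consensus(signal))

# The native C-terminus of the yeast MFP (-AKI) also fits this pattern.
print("AKI", has_pts1_consensus("AKI"))
```

Every signal listed in the text matches the pattern, since each longer signal ends in one of the listed tripeptides. A pattern match of this kind is only a first screen; targeting still has to be confirmed experimentally, as in the localization studies planned in Example 10.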

A comparison of the peroxisomal targeting sequence 1 (PTS1) found in mammals, 
fungi and trypanosomes was performed by Purdue, P.E. and Lazarow, P.B. (J. Biol. Chem. 
269: 30065-30068 (1994)). All sequences shown in Table 10 are functional in at least one 
species. Other sequences may or may not have been tested. For trypanosomes, all 
sequences with a single amino acid change from SKL that are not shown are nonfunctional. 
The asterisks refer to the fact that -NKL and -SQL (outside the mammalian consensus, but 
not directly tested) have been found at the C termini of mammalian peroxisomal proteins. 
Uppercase, functional; lowercase, nonfunctional; underlined, not yet found on a 
peroxisomal protein in that species. 

Table 10. C-terminal peroxisomal targeting sequences. 



Columns include Mammals and S. cerevisiae. Legible entries, in order of appearance: 
SKI, ski, SKI, SKI, NKL, NKL, ARF, AKI, AKI, pki, GKI, ssl, SSL, SKM, G/H/P/T-KL, 
S-M/N/O-L, SKY. 



The minimal peroxisomal targeting sequence 1 (PTS1) in plants has been found to
be ARM, SRM, SKL, ARL, SRL, PSI, or PRM (compiled from Volokita, M., Plant J.,
1: 361-366 (1991); Olsen, L.J. et al., Plant Cell, 5: 941-952 (1993); Trelease, R.N. et al.,
Protoplasma, 195: 156-167 (1996); Gietl, C., Physiol. Plant., 97: 599-608 (1996); Purdue,
P.E. and Lazarow, P.B., J. Biol. Chem., 269: 30065-30068 (1994); Subramani, Ann. Rev.
Cell Biol., 9: 445-478 (1993); Mullen, R.T., et al., Plant J., 12: 313-322 (1997); Lee, M.S.,
et al., Plant Cell, 9: 185-197 (1997)).

Some proteins are targeted to the peroxisome via an N-terminal extension called
PTS2, for peroxisome targeting sequence 2. In this case, a consensus sequence of nine amino
acids has been defined, being (R/K)(L/Q/I)XXXXX(H/Q)L. Foreign proteins (e.g.,
β-glucuronidase) can also be targeted to the peroxisome in plants by adding a PTS2 sequence
at the N-terminal end of the protein (Kato et al., Plant Cell 8: 1601-1611 (1996)).
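
The targeting rules described above can be illustrated with a short sketch. It assumes a protein is given as a one-letter amino acid string; the function names, the example sequences and the 40-residue N-terminal search window are hypothetical choices made for illustration and are not taken from the disclosure.

```python
import re

# Plant PTS1 tripeptides listed in the text above.
PLANT_PTS1 = {"ARM", "SRM", "SKL", "ARL", "SRL", "PSI", "PRM"}

# PTS2 consensus from the text: (R/K)(L/Q/I)XXXXX(H/Q)L, nine residues.
PTS2_PATTERN = re.compile(r"[RK][LQI][A-Z]{5}[HQ]L")


def has_plant_pts1(protein: str) -> bool:
    """Return True if the last three residues match a listed plant PTS1."""
    return protein.upper()[-3:] in PLANT_PTS1


def find_pts2(protein: str, n_terminal_window: int = 40):
    """Return the start index of the first PTS2 match near the N-terminus, or None.

    The window size is an illustrative assumption, not a value from the text.
    """
    match = PTS2_PATTERN.search(protein.upper()[:n_terminal_window])
    return match.start() if match else None


def add_pts1(protein: str, signal: str = "ARM") -> str:
    """Append a listed PTS1 to the C-terminus of a hypothetical protein sequence."""
    if signal not in PLANT_PTS1:
        raise ValueError(f"{signal} is not in the listed plant PTS1 set")
    return protein + signal
```

For instance, `add_pts1("MAAA")` yields a sequence ending in ARM, which `has_plant_pts1` then accepts. Whether a given signal actually functions in a particular protein context must still be established experimentally.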

EXAMPLE 16: Co-expression of PHA with other sequences resulting in increased 
or novel PHA biosynthesis 

PHAmcl synthesized in transgenic plants can include a large variety of monomers,
with functional groups that can be used to modify and improve the characteristics of the
polymer before or after extraction from the plant. For example, the presence of double
bonds, epoxy groups, or acetylated groups within the PHA may be used to cross-link the 
polymer. The examples herein have demonstrated the incorporation of the following range 
of monomers into plant PHAmcl: even-chain saturated 3-OH-acyl monomers with six to 
sixteen carbons; odd-chain saturated 3-OH-acyl monomers with seven to thirteen carbons; 
unsaturated 3-OH-acyl monomers with 8, 12, 14, and 16 carbons and with 1, 2, or 3 double
bonds; branched-chain 3-OH-acyl monomers (8-methyl-3-D-hydroxy-nonanoic acid and 6- 
methyl-3-D-hydroxy-heptanoic acid) and 4-OH-acyl monomers (D-4-hydroxy-decanoate). 
Although in these experiments some monomers, such as branched-chain, odd-chain or 
hydroxylated 3-hydroxyacids, were found included in PHAs after exogenous fatty acids 
were supplied to the transgenic plants, the same range of monomers would also be included 
in plant PHA from fatty acids supplied from endogenous fatty acid synthesis. Thus, one can 
predict being able to synthesize PHA polymers in plants that have a wide range of 
monomers, for example, higher proportion of short-chain monomers, unsaturated bonds at 
novel positions, monomers with hydroxylated groups, epoxy groups, acetylated groups, keto 
groups, cyclopentenyl groups, cyclopropanoid groups, furanoid groups or halogenated 
groups, branched chain, cyclic groups or any other novel monomers for which the 
equivalent functional groups exist in fatty acids in plants. The incorporation of these novel 
monomers derived from fatty acids into plant PHAs could be accomplished by expressing a 
PHA synthase in a plant which synthesizes these unusual fatty acids either naturally or after 
expression of a transgene such as fatty-acyl-thioesterases, -hydroxylases, -desaturases,
-epoxidases, or -acetylases.

It is also conceivable that the substrate specificity of the PHA synthase could be
modified to allow the incorporation of a wider range of monomers into PHA. One can 
predict that the range of monomers which could be included into plant PHAs from such a 
modified PHA synthase will include monomers that can be derived from plant fatty acid 
metabolism found in wild type plants or plants expressing transgenes (such as desaturases,
hydroxylases, thioesterases, epoxidases, acetylases) which result in the modification of
fatty acids synthesized in plants. It is also conceivable that suitable hydroxy acid substrates
for the PHA synthase can be obtained from the amino acid metabolism or the plant
secondary metabolism.

It has been demonstrated before that plants can synthesize PHB from acetyl-CoA 
through the expression of the 3-ketothiolase, acetoacetyl-CoA reductase and PHB synthase 
from A. eutrophus (Poirier, Y. et al., Science 256: 520-523 (1992); Nawrath, C. et al., Proc. 
Natl. Acad. Sci. U.S.A. 91: 12760-12764 (1994)). The examples herein demonstrate that 
PHAmcl can be synthesized in plants expressing a PHA synthase which can accept
monomers from H6-H16. Since acetyl-CoA is also found in the peroxisome, one can
predict that co-expression of a PHA synthase with a substrate specificity for 3-hydroxyacids
ranging from H4 to H8 or higher in the peroxisome, and of the A. eutrophus acetoacetyl-
CoA reductase, would lead to the biosynthesis of a copolymer containing hydroxybutyrate
and hydroxyacids of H6 and higher. In this pathway, the expression of the 3-ketothiolase
from A. eutrophus may not be required since the peroxisome already contains a 3- 
ketothiolase. 


The examples herein clearly show that synthesis of PHA in plants can be 
significantly enhanced by increasing the pool of fatty acids which is channeled through
β-oxidation. Thus, when short-chain fatty acids were added externally in the form of
TWEEN-20 to PHAC1-transgenic plants, there was a 30-fold increase in the amount of
PHA synthesized in plants. Similar large increases in PHA synthesis were found when 
tridecanoic acid and 8-methyl-nonanoic acid were added to the growth media. It is 
hypothesized that because these fatty acids could not be incorporated into membranes 
without disrupting them, the fatty acids are detoxified by channeling them to the peroxisome
for degradation by the β-oxidation cycle. Thus, increased channeling of fatty acids to the
β-oxidation cycle results in an increase in PHA synthesized using intermediates of fatty acid
oxidation. One can predict from this work that any changes in plants which result in an
increased flux of fatty acids to the β-oxidation cycle will result in an increase in PHA
synthesis in plants expressing a PHA synthase targeted to the peroxisome. Increasing the
flux of fatty acids to the β-oxidation cycle could be accomplished by overexpressing
enzymes which lead to the biosynthesis of modified fatty acids. This has been demonstrated 
in plants expressing thioesterase (Eccleston, V.S. et al., Planta 198: 46-53 (1996)) and
implied in plants expressing hydroxylase (van de Loo, F.N. et al., Proc. Natl. Acad. Sci.
U.S.A. 92: 6743-6747 (1995)). Increase of flux of lipids to the β-oxidation cycle and to
PHA synthesis could also be accomplished by expressing other fatty acid modifying
enzymes, such as desaturases, epoxidases, acetylases, enzymes involved in synthesis of
branched-chain fatty acids, etcetera. This concept has been directly demonstrated in the
present work with a fatty acyl-ACP thioesterase. It was shown that co-expression of a fatty
acyl-ACP thioesterase in a plant expressing a peroxisomal PHA synthase leads to a 10-fold
increase in PHA (Table 7). In addition to increasing the amount of PHA in plants,
expression of the thioesterase leads to a predictable change in the composition of the PHA,
i.e., since the C. lanceolata FatB3 thioesterase has the highest affinity for saturated C10 fatty
acyl-ACP, there is a corresponding large increase in hydroxydecanoic acid (H10) present in 
the plant PHA (Table 8). Thus, expression of fatty acid modifying enzymes in conjunction 
with a PHA synthase in plants not only leads to an increase in the amount of PHA 
synthesized in plants, but also leads to predictable changes in the PHA monomer
composition, e.g. co-expression of a short-chain fatty acyl-ACP thioesterase would lead to 
an increase in the proportion of short-chain hydroxyacid monomers in plant PHA, co- 
expression of a long-chain fatty acyl-ACP thioesterase would lead to an increase in the 
proportion of long-chain hydroxyacid monomers in plant PHA, co-expression of a fatty acyl 
hydroxylase would lead to an increase in the proportion of hydroxylated hydroxyacid 
monomers in plant PHA, co-expression of a fatty acyl epoxidase would lead to an increase 
in the proportion of epoxidated monomers in plant PHA, co-expression of a fatty acyl 
acetylase would lead to an increase in the proportion of acetylated hydroxyacid monomers 
in plant PHA, and co-expression of a fatty acyl desaturase would lead to an increase in the
proportion of unsaturated hydroxyacid monomers in plant PHA. Increase in flux of lipids
through the β-oxidation cycle could also be accomplished by overexpressing the key
regulators (i.e., transcriptional factors) involved in the up-regulation of the entire β-oxidation
cycle pathway during germination or senescence. This last approach would have the
advantage of turning on the β-oxidation cycle in tissues which normally have only low
activity, such as the developing seeds of oil crops.

The examples herein point out the impact of fatty acid modifying enzymes for the 
production of novel PHA in transgenic plants expressing a PHA synthase. One key enzyme 
appears to be a 3-hydroxy-acyl-CoA epimerase. Although the normal function of the
epimerase is to convert D-3-hydroxy-acyl-CoAs to the L-form required for the action of the 
L-3-hydroxy-acyl-CoA dehydrogenase, the reverse reaction of the epimerase can be 
responsible for converting the L-form to the D-form, which is essential for the activity of 
the PHA synthase. For that purpose the epimerase is important for the supply of the 
substrates for the PHA synthase derived from β-oxidation in the peroxisomes. Recombinant
forms of such an epimerase activity expressed in peroxisomes or in other plant cell 
compartments like the cytoplasm or the plastids could play an important role in the 
production of PHA in transgenic plants. It is possible that the slow rate of the epimerase 
"reverse reaction" could be the major factor limiting the supply of substrates for the PHA 
synthase. The substrate limitation due to this could be the reason why PHA synthesis
seemed to have reached a maximum in seedlings germinated both in the light and in the
dark in liquid medium supplemented with TWEEN-20, which contains only saturated fatty 
acids. 

The importance of certain fatty acid desaturases is highlighted by Table 3, wherein 
petroselinic acid (C18:1, 6-cis) was supplied to germinating PHAC1#3.3 seedlings in liquid
medium, resulting in the specific increase of the H14 monomer. This indicated that any
fatty acid containing unsaturated bonds starting at even-numbered carbons directly gives
rise to the appropriate D-3-hydroxy-acyl-CoAs during β-oxidation, thus bypassing the
otherwise necessary "reverse reaction" of the epimerase to generate the D-intermediates.
Similarly, the H8 and the H8:1 monomers are predicted to originate from the unsaturated fatty
acids linoleic acid (C18:2, 9,12-all cis) and linolenic acid (C18:3, 9,12,15-all cis). For that
reason any plant containing high levels of fatty acids with unsaturated bonds starting at
even-numbered carbons could be of interest for the production of PHAmcl, or the transgenic
expression of suitable fatty acid desaturases producing such unsaturated fatty acids in plants 
containing the PHA synthase would be similarly attractive for PHA production and 
monomer manipulation. 
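
The relationship between the position of a cis double bond and the resulting monomer can be sketched as simple arithmetic. The following is a deliberately simplified model, offered as an assumption rather than as the authors' method: each β-oxidation cycle is taken to remove two carbons from the carboxyl end, so a cis double bond at even-numbered carbon p arrives at position 2 after (p - 2)/2 cycles, at which point the D-3-hydroxy-acyl-CoA is assumed to form directly. Double bonds at odd-numbered carbons, auxiliary enzymes and reaction rates are not modeled, and the function name is hypothetical.

```python
def predict_d_hydroxy_monomers(chain_length: int, cis_double_bonds: list[int]) -> list[str]:
    """Predict PHA monomers expected from cis double bonds at even-numbered carbons.

    Simplifying assumption (illustrative only): a cis double bond at even
    carbon p reaches position 2 after (p - 2) carbons have been removed by
    beta-oxidation, where hydration is taken to give the D-3-hydroxy-acyl-CoA.
    """
    monomers = []
    for position in sorted(cis_double_bonds):
        if position % 2 != 0 or position < 2:
            continue  # bonds at odd-numbered carbons are not modeled in this sketch
        carbons_removed = position - 2
        monomer_length = chain_length - carbons_removed
        # Double bonds further from the carboxyl end remain in the monomer.
        remaining_unsaturation = sum(1 for p in cis_double_bonds if p > position)
        label = f"H{monomer_length}"
        if remaining_unsaturation:
            label += f":{remaining_unsaturation}"
        monomers.append(label)
    return monomers
```

Under these assumptions, petroselinic acid (`predict_d_hydroxy_monomers(18, [6])`) gives H14, linoleic acid (`[9, 12]`) gives H8, and linolenic acid (`[9, 12, 15]`) gives H8:1, in line with the monomers named above.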

The examples herein demonstrate that a peroxisomally-located PHA synthase is able 
to divert intermediates from β-oxidation for their incorporation into PHA. The existence of
the required D-3-hydroxy-acyl-CoA substrates was important for the synthesis of PHA. In 
light of the present disclosure, one may predict that PHA can be produced in a similar 
manner in any other compartment of any plant cell, provided that a supply of such D-3- 
hydroxy-acyl-CoA intermediates is present due either to an endogenous metabolic pathway 
or due to an artificially created pathway utilizing expression of transgenes. Fatty acid 
biosynthesis occurs in the plastids in plant cells, and modifications of this pathway could 
turn the plastids into a suitable source of D-3-hydroxy-acyl-CoA intermediates, which could 
subsequently be used to produce PHA either in the plastid itself or in other cell 
compartments. 

EXAMPLE 17: Protein analysis 

Leaves from transgenic plants were homogenized in 200 mM Tris-HCl (pH 7.5), 250 
mM EDTA, 5 mM dithiothreitol and 1 mM phenylmethylsulfonyl fluoride. The 
homogenate was clarified by centrifugation and protein analyzed by Western blot using the 
ECL detection system (Amersham, Arlington Heights, IL). 

EXAMPLE 18: Immunolocalization 

Transgenic plants were grown on media containing MS salts, 1% sucrose, 0.7% agar
and 50 ng/mL kanamycin for either 7 days in the light or 1 day in the light followed by 6
days in the dark. Whole plants were fixed for 2 hours at room temperature in 4%
formaldehyde, 0.5% glutaraldehyde, 50 mM sodium cacodylate pH 7.3. The tissue samples
were dehydrated in an ethanol series and embedded in LR White resin. Ultrathin sections
were cut using a microtome, mounted on formvar-coated gold grids and blocked in 0.8%
(w/v) bovine serum albumin, 0.1% (w/v) gelatine, 5% (w/v) normal goat serum and 2 mM
sodium azide in PBS (10 mM sodium phosphate, 150 mM sodium chloride, pH 7.4). Grids
were incubated for 1 hour at room temperature with antiserum against PHA synthase (1:50),
glycolate oxidase (1:2000) and isocitrate lyase (1:1000) in the blocking solution followed by
a 4 hour incubation at room temperature with a 1:50 dilution of gold-conjugated goat
anti-rabbit antibodies (15 nm gold particles) in PBS. Immunolabeled sections were
double-stained with uranyl acetate and lead citrate and viewed with a Jeol JEM transmission
electron microscope. 

EXAMPLE 19: PHA extraction and analysis

Fresh or dried frozen plant material was ground in a mortar and lyophilized. The 
powder was extracted with methanol in a Soxhlet apparatus for 24 hours followed by PHA 
extraction with chloroform for 24 hours, both at 85°C. The PHA-containing chloroform 
was concentrated under reduced pressure and extracted once with water to remove residual 
solid particles. PHA was precipitated by the addition of 10 volumes of cold methanol and 
subsequently washed by two cycles of chloroform solubilisation and methanol precipitation. 
PHA dissolved in chloroform was transesterified by acid methanolysis (Huijberts, G. N. et
al., Appl. Environ. Microbiol. 58: 536-544 (1992)) and analyzed by gas chromatography
and mass spectrometry (GC-MS) using a Hewlett-Packard 5890 gas chromatograph (30 m
long HP-5MS column) coupled to a Hewlett-Packard 5972 mass spectrometer (Hewlett
Packard, Palo Alto, CA). Molecular weights of PHA samples were
determined by gel permeation chromatography on a Waters 150 CV (Waters Corp., Milford,
MA) equipped with a differential refractive index detector and an on-line viscometer and
three ultrastyragel columns in series (10^4, 10^5 and 10^6 Å). Samples were prepared in
dichloromethane and calibration performed using polystyrene standards. 




EXAMPLE 20: Plant Vectors 






In plants, transformation vectors capable of introducing DNAs encoding proteins involved in
PHA biosynthesis are easily designed, and generally contain one or more DNA coding
sequences of interest under the transcriptional control of 5' and 3' regulatory sequences.
Such vectors generally comprise, operatively linked in sequence in the 5' to 3' direction, a
promoter sequence that directs the transcription of a downstream heterologous structural
DNA in a plant; optionally, a 5' non-translated leader sequence; a nucleotide sequence that
encodes a protein of interest; and a 3' non-translated region that encodes a polyadenylation
signal which functions in plant cells to cause the termination of transcription and the
addition of polyadenylate nucleotides to the 3' end of the mRNA encoding said protein.
Plant transformation vectors also generally contain a selectable marker. Typical 5'-3'
regulatory sequences include a transcription initiation start site, a ribosome binding site, an 
RNA processing signal, a transcription termination site, and/or a polyadenylation signal. 
Vectors for plant transformation have been reviewed in Rodriguez et al. (Vectors: A Survey 
of Molecular Cloning Vectors and Their Uses, Butterworths, Boston. (1988)), Glick et al. 
(Methods in Plant Molecular Biology and Biotechnology, CRC Press, Boca Raton, Fla. 
(1993)), and Croy (Plant Molecular Biology Labfax, Hames and Rickwood (Eds.), BIOS 
Scientific Publishers Limited, Oxford, UK. (1993)).
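
The 5' to 3' arrangement of vector elements described above can be represented as a small data structure. This is only an illustrative sketch: the class names, field names and the example element labels (including the "nos 3'" region and the fusion description) are hypothetical and are not constructs disclosed herein.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExpressionCassette:
    """Elements of a plant expression cassette, listed 5' to 3'."""
    promoter: str
    coding_sequence: str
    polyadenylation_region: str
    leader: Optional[str] = None  # optional 5' non-translated leader

    def elements_5_to_3(self) -> list:
        """Return the element names in 5' to 3' order, skipping an absent leader."""
        ordered = [self.promoter, self.leader, self.coding_sequence,
                   self.polyadenylation_region]
        return [element for element in ordered if element is not None]


@dataclass
class TransformationVector:
    """A vector holding one or more cassettes plus a selectable marker."""
    cassettes: list
    selectable_marker: str


# Hypothetical element names chosen for illustration only.
example = TransformationVector(
    cassettes=[ExpressionCassette("CaMV 35S", "PHA synthase-PTS1 fusion", "nos 3'")],
    selectable_marker="kanamycin resistance",
)
```

Calling `example.cassettes[0].elements_5_to_3()` lists the promoter, coding sequence and 3' region in order; supplying a `leader` inserts it between the promoter and the coding sequence.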

EXAMPLE 21: Plant Promoters

Plant promoter sequences can be constitutive or inducible, environmentally- or 
developmentally-regulated, or cell- or tissue-specific. Often-used constitutive promoters 
include the CaMV 35S promoter (Odell et al., Nature 313: 810 (1985)), the enhanced 
CaMV 35S promoter, the Figwort Mosaic Virus (FMV) promoter (Richins et al., Nucleic 
Acids Res. 20: 8451 (1987)), the mannopine synthase (mas) promoter, the nopaline synthase 
(nos) promoter, and the octopine synthase (ocs) promoter. Useful inducible promoters 
include promoters induced by salicylic acid or polyacrylic acids (PR-1, Williams, S. W. et
al., Biotechnology 10: 540-543 (1992)), induced by application of safeners (substituted
benzenesulfonamide herbicides, Hershey, H.P. and Stoner, T.D., Plant Mol. Biol. 17:
679-690 (1991)), heat-shock promoters (Ou-Lee et al., Proc. Natl. Acad. Sci. U.S.A. 83: 6815
(1986); Ainley et al., Plant Mol. Biol. 14: 949 (1990)), a nitrate-inducible promoter derived
from the spinach nitrite reductase gene (Back et al., Plant Mol. Biol. 17: 9 (1991)),
hormone-inducible promoters (Yamaguchi-Shinozaki et al., Plant Mol. Biol. 15: 905 (1990);
Kares et al., Plant Mol. Biol. 15: 905 (1990)), and light-inducible promoters associated with
the small subunit of RuBP carboxylase and LHCP gene families (Kuhlemeier et al., Plant
Cell 1: 471 (1989); Feinbaum et al., Mol. Gen. Genet. 226: 449 (1991); Weisshaar et al.,
EMBO J. 10: 1777 (1991); Lam and Chua, J. Biol. Chem. 266: 17131 (1990); Castresana et
al., EMBO J. 7: 1929 (1988); Schulze-Lefert et al., EMBO J. 8: 651 (1989)). Examples of
useful tissue-specific, developmentally-regulated promoters include the β-conglycinin 7S
promoter (Doyle et al., J. Biol. Chem. 261: 9228 (1986); Slighton and Beachy, Planta 172:
356 (1987)), and seed-specific promoters (Knutzon et al., Proc. Natl. Acad. Sci. U.S.A. 89:
2624 (1992); Bustos et al., EMBO J. 10: 1469 (1991); Lam and Chua, Science 248: 471
(1991); Stayton et al., Aust. J. Plant. Physiol. 18: 507 (1991)). Plant functional promoters
useful for preferential expression in seed plastids include those from plant storage protein 
genes and from genes involved in fatty acid biosynthesis in oilseeds. Examples of such 
promoters include the 5' regulatory regions from such genes as napin (Kridl et al., Seed Sci. 
Res. 1: 209 (1991)), phaseolin, zein, soybean trypsin inhibitor, ACP, stearoyl-ACP 
desaturase, and oleosin. Seed-specific gene regulation is discussed in EP 0 255 378. 
Promoter hybrids can also be constructed to enhance transcriptional activity (Comai, L. and
Moran, P.M., U.S. Patent No. 5,106,739, issued April 21, 1992), or to combine desired
transcriptional activity and tissue specificity. 

EXAMPLE 22: Plant transformation and regeneration 

A variety of different methods can be employed to introduce such vectors into plant 
protoplasts, cells, callus tissue, leaf discs, meristems, etcetera, to generate transgenic plants, 
including Agrobacterium-mediated transformation, particle gun delivery, microinjection, 
electroporation, polyethylene glycol-mediated protoplast transformation, liposome-mediated
transformation, etc. (reviewed in Potrykus, Ann. Rev. Plant Physiol. Plant Mol. Biol. 42:
205 (1991)). In general, transgenic plants comprising cells containing and expressing
DNAs encoding enzymes facilitating PHA biosynthesis can be produced by transforming
plant cells with a DNA construct as described above via any of the foregoing methods; 
selecting plant cells that have been transformed on a selective medium; regenerating plant 
cells that have been transformed to produce differentiated plants; and selecting a 
transformed plant which expresses the enzyme-encoding nucleotide sequence. 

Specific methods for transforming a wide variety of dicots and obtaining transgenic 
plants are well documented in the literature (Gasser and Fraley, Science 244: 1293 (1989); 
Fisk and Dandekar, Scientia Horticulturae 55: 5 (1993); Christou, Agro Food Industry Hi 
Tech, p. 17 (1994); and the references cited therein).

Successful transformation and plant regeneration have been reported in the 
monocots as follows: asparagus (Asparagus officinalis; Bytebier et al., Proc. Natl. Acad.
Sci. U.S.A. 84: 5345 (1987)); barley (Hordeum vulgare; Wan and Lemaux, Plant Physiol.
104: 37 (1994)); maize (Zea mays; Rhodes et al., Science 240: 204 (1988); Gordon-Kamm
et al., Plant Cell 2: 603 (1990); Fromm et al., Bio/Technology 8: 833 (1990); Koziel et al.,
Bio/Technology 11: 194 (1993)); oats (Avena sativa; Somers et al., Bio/Technology 10:
1589 (1992)); orchardgrass (Dactylis glomerata; Horn et al., Plant Cell Rep. 7: 469 (1988));
rice (Oryza sativa, including indica and japonica varieties; Toriyama et al., Bio/Technology
6: 10 (1988); Zhang et al., Plant Cell Rep. 7: 379 (1988); Luo and Wu, Plant Mol. Biol.
Rep. 6: 165 (1988); Zhang and Wu, Theor. Appl. Genet. 76: 835 (1988); Christou et al.,
Bio/Technology 9: 957 (1991)); rye (Secale cereale; De la Pena et al., Nature 325: 274
(1987)); sorghum (Sorghum bicolor; Cassas et al., Proc. Natl. Acad. Sci. U.S.A. 90: 11212
(1993)); sugar cane (Saccharum spp.; Bower and Birch, Plant J. 2: 409 (1992)); tall fescue
(Festuca arundinacea; Wang et al., Bio/Technology 10: 691 (1992)); turfgrass (Agrostis
palustris; Zhong et al., Plant Cell Rep. 13: 1 (1993)); wheat (Triticum aestivum; Vasil et al.,
Bio/Technology 10: 667 (1992); Weeks et al., Plant Physiol. 102: 1077 (1993); Becker et
al., Plant J. 5: 299 (1994)), and alfalfa (Masoud, S.A. et al., Transgen. Res. 5: 313 (1996)).

EXAMPLE 23: Host plants

Particularly useful plants for PHA production include those that produce carbon 
substrates which can be employed for PHA biosynthesis, including tobacco, wheat, potato, 
Arabidopsis, and high oil seed plants such as corn, soybean, canola, oil seed rape, 
sunflower, flax, peanut, sugarcane, switchgrass, and alfalfa. 

If the host plant of choice does not produce the requisite fatty acid substrates in
sufficient quantities, it can be modified, for example by mutagenesis or genetic 
transformation, to block or modulate the glycerol ester and fatty acid biosynthesis or 
degradation pathways so that it accumulates the appropriate substrates for PHA production. 
Expression of enzymes such as acyl-ACP thioesterase, fatty acyl hydroxylase, and yeast 
multifunctional protein (MFP) may serve to increase the flux of substrates in the 
peroxisome, leading to higher levels of PHA biosynthesis. 

EXAMPLE 24: Nucleic acid mutation and hybridization 

Variations in the nucleic acid sequence encoding a fusion protein may lead to mutant 
protein sequences that display equivalent or superior enzymatic characteristics when 
compared to the sequences disclosed herein. This invention accordingly encompasses 
nucleic acid sequences which are similar to the sequences disclosed herein, protein 
sequences which are similar to the sequences disclosed herein, and the nucleic acid 
sequences that encode them. Mutations may include deletions, insertions, truncations, 
substitutions, fusions, and the like. 


Mutations to a nucleic acid sequence may be introduced in either a specific or 
random manner, both of which are well known to those of skill in the art of molecular 
biology. A myriad of site-directed mutagenesis techniques exist, typically using 
oligonucleotides to introduce mutations at specific locations in a nucleic acid sequence. 
Examples include single strand rescue (Kunkel, T., Proc. Natl. Acad. Sci. U.S.A. 82: 488-492
(1985)), unique site elimination (Deng and Nickloff, Anal. Biochem. 200: 81 (1992)),
nick protection (Vandeyar et al., Gene 65: 129-133 (1988)), and PCR (Costa et al., Methods
Mol. Biol. 57: 31-44 (1996)). Random or non-specific mutations may be generated by
chemical agents (for a general review, see Singer and Kusmierek, Ann. Rev. Biochem. 52:
655-693 (1982)) such as nitrosoguanidine (Cerda-Olmedo et al., J. Mol. Biol. 33: 705-719
(1968); Guerola et al., Nature New Biol. 230: 122-125 (1971)) and 2-aminopurine (Rogan
and Bessman, J. Bacteriol. 103: 622-633 (1970)), or by biological methods such as passage
through mutator strains (Greener et al., Mol. Biotechnol. 7: 189-195 (1997)).

Nucleic acid hybridization is a technique well known to those of skill in the art of 
DNA manipulation. The hybridization properties of a given pair of nucleic acids are an
indication of their similarity or identity. Mutated nucleic acid sequences may be selected 
for their similarity to the disclosed nucleic acid sequences on the basis of their hybridization 
to the disclosed sequences. Low stringency conditions may be used to select sequences 
with multiple mutations. One may wish to employ conditions such as about 0.15 M to 
about 0.9 M sodium chloride, at temperatures ranging from about 20°C to about 55°C. 
High stringency conditions may be used to select for nucleic acid sequences with higher 
degrees of identity to the disclosed sequences. Conditions employed may include about 
0.02 M to about 0.15 M sodium chloride, about 0.5% to about 5% casein, about 0.02% SDS 
and/or about 0.1% N-laurylsarcosine, about 0.001 M to about 0.03 M sodium citrate, at 
temperatures between about 50°C and about 70°C. More preferably, high stringency 
conditions are 0.02 M sodium chloride, 0.5% casein, 0.02% SDS, 0.001 M sodium citrate, 
at a temperature of 50°C. 
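
The salt and temperature ranges quoted above can be captured in a small lookup, sketched here under stated assumptions. Only sodium chloride concentration and temperature are modeled; casein, SDS, N-laurylsarcosine and sodium citrate are ignored, the "about" in the text is treated as exact bounds, and the function name is hypothetical. Because the two ranges overlap at their edges, a condition may match both categories.

```python
# Ranges quoted in the text (sodium chloride in mol/L, temperature in deg C).
STRINGENCY_RANGES = {
    "low": {"nacl_molar": (0.15, 0.9), "temperature_c": (20.0, 55.0)},
    "high": {"nacl_molar": (0.02, 0.15), "temperature_c": (50.0, 70.0)},
}


def classify_stringency(nacl_molar: float, temperature_c: float) -> list:
    """Return the stringency categories whose quoted ranges contain the condition."""
    matches = []
    for name, limits in STRINGENCY_RANGES.items():
        salt_low, salt_high = limits["nacl_molar"]
        temp_low, temp_high = limits["temperature_c"]
        salt_ok = salt_low <= nacl_molar <= salt_high
        temp_ok = temp_low <= temperature_c <= temp_high
        if salt_ok and temp_ok:
            matches.append(name)
    return matches
```

For example, the more preferred condition of 0.02 M sodium chloride at 50°C is classified as high stringency only, whereas 0.15 M at 52°C falls inside both quoted ranges.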

EXAMPLE 25: Determination of homologous and degenerate nucleic acid sequences 

Modifications and changes may be made in the sequence of the proteins of the
present invention and the nucleic acid segments which encode them and still obtain a 
functional molecule that encodes a protein with desirable properties. The following is a 
discussion based upon changing the amino acid sequence of a protein to create an 
equivalent, or possibly an improved, second-generation molecule. The amino acid changes
may be achieved by changing the codons of the nucleic acid sequence, according to the
codons given in Table 11.



Table 11: Codon degeneracies of amino acids

Amino acid      One letter  Three letter  Codons
Alanine         A           Ala           GCA GCC GCG GCT
Cysteine        C           Cys           TGC TGT
Aspartic acid   D           Asp           GAC GAT
Glutamic acid   E           Glu           GAA GAG
Phenylalanine   F           Phe           TTC TTT
Glycine         G           Gly           GGA GGC GGG GGT
Histidine       H           His           CAC CAT
Isoleucine      I           Ile           ATA ATC ATT
Lysine          K           Lys           AAA AAG
Leucine         L           Leu           TTA TTG CTA CTC CTG CTT
Methionine      M           Met           ATG
Asparagine      N           Asn           AAC AAT
Proline         P           Pro           CCA CCC CCG CCT
Glutamine       Q           Gln           CAA CAG
Arginine        R           Arg           AGA AGG CGA CGC CGG CGT
Serine          S           Ser           AGC AGT TCA TCC TCG TCT
Threonine       T           Thr           ACA ACC ACG ACT
Valine          V           Val           GTA GTC GTG GTT
Tryptophan      W           Trp           TGG
Tyrosine        Y           Tyr           TAC TAT
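
As an illustration of how the codon degeneracies of Table 11 translate into alternative nucleic acid sequences, the following sketch counts how many distinct DNA sequences encode a given peptide. The codon assignments are those of the standard genetic code; the function names are hypothetical.

```python
from math import prod

# Standard genetic code, DNA codons per one-letter amino acid (stop codons omitted).
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCT"],
    "C": ["TGC", "TGT"],
    "D": ["GAC", "GAT"],
    "E": ["GAA", "GAG"],
    "F": ["TTC", "TTT"],
    "G": ["GGA", "GGC", "GGG", "GGT"],
    "H": ["CAC", "CAT"],
    "I": ["ATA", "ATC", "ATT"],
    "K": ["AAA", "AAG"],
    "L": ["TTA", "TTG", "CTA", "CTC", "CTG", "CTT"],
    "M": ["ATG"],
    "N": ["AAC", "AAT"],
    "P": ["CCA", "CCC", "CCG", "CCT"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGT"],
    "S": ["AGC", "AGT", "TCA", "TCC", "TCG", "TCT"],
    "T": ["ACA", "ACC", "ACG", "ACT"],
    "V": ["GTA", "GTC", "GTG", "GTT"],
    "W": ["TGG"],
    "Y": ["TAC", "TAT"],
}


def degeneracy(residue: str) -> int:
    """Number of codons that encode a single amino acid."""
    return len(CODONS[residue.upper()])


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(degeneracy(residue) for residue in peptide)
```

For the tripeptide SKL, the count is 6 x 2 x 6 = 72 possible coding sequences; for ARM it is 4 x 6 x 1 = 24.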



Certain amino acids may be substituted for other amino acids in a protein sequence
without appreciable loss of enzymatic activity. It is thus contemplated that various changes
may be made in the peptide sequences of the disclosed protein sequences, or their
corresponding nucleic acid sequences, without appreciable loss of the biological activity.



In making such changes, the hydropathic index of amino acids may be considered.
The importance of the hydropathic amino acid index in conferring interactive biological
function on a protein is generally understood in the art (Kyte and Doolittle, J. Mol. Biol.
157: 105-132 (1982)). It is accepted that the relative hydropathic character of the amino
acid contributes to the secondary structure of the resultant protein, which in turn defines the
interaction of the protein with other molecules, for example, enzymes, substrates, receptors,
DNA, antibodies, antigens, and the like. 



Each amino acid has been assigned a hydropathic index on the basis of its
hydrophobicity and charge characteristics. These are: isoleucine (+4.5); valine (+4.2);
leucine (+3.8); phenylalanine (+2.8); cysteine/cystine (+2.5); methionine (+1.9); alanine
(+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3);
proline (-1.6); histidine (-3.2); glutamate/glutamine/aspartate/asparagine (-3.5); lysine
(-3.9); and arginine (-4.5).

It is known in the art that certain amino acids may be substituted by other amino 
acids having a similar hydropathic index or score and still result in a protein with similar 
biological activity, i.e., still obtain a biologically functional protein. In making such 
changes, the substitution of amino acids whose hydropathic indices are within ±2 is 
preferred, those within ±1 are more preferred, and those within ±0.5 are most preferred. 
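
The preference tiers described above can be sketched as a lookup on the listed hydropathic indices. This is a minimal illustration assuming one-letter amino acid codes; the function name and the small floating-point tolerance are choices made here, not part of the disclosure.

```python
# Hydropathic indices as listed in the text, keyed by one-letter code.
HYDROPATHIC_INDEX = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def substitution_preference(original: str, replacement: str):
    """Return the preference tier for a substitution, or None if outside +/-2."""
    difference = abs(HYDROPATHIC_INDEX[original.upper()]
                     - HYDROPATHIC_INDEX[replacement.upper()])
    tolerance = 1e-9  # guards against floating-point rounding at the tier edges
    if difference <= 0.5 + tolerance:
        return "most preferred"
    if difference <= 1.0 + tolerance:
        return "more preferred"
    if difference <= 2.0 + tolerance:
        return "preferred"
    return None
```

For example, isoleucine to valine (difference 0.3) falls in the most preferred tier, lysine to arginine (0.6) in the more preferred tier, and leucine to methionine (1.9) in the preferred tier, while isoleucine to arginine lies outside all three.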

It is also understood in the art that the substitution of like amino acids may be made 
effectively on the basis of hydrophilicity. U.S. Patent No. 4,554,101 (Hopp, T.P., issued 
November 19, 1985) states that the greatest local average hydrophilicity of a protein, as 
governed by the hydrophilicity of its adjacent amino acids, correlates with a biological 
property of the protein. The following hydrophilicity values have been assigned to amino 
acids: arginine/lysine (+3.0); aspartate/glutamate (+3.0 ±1); serine (+0.3); 
asparagine/glutamine (+0.2); glycine (0); threonine (-0.4); proline (-0.5 ±1); 
alanine/histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine/isoleucine
(-1.8); tyrosine (-2.3); phenylalanine (-2.5); and tryptophan (-3.4).

It is understood that an amino acid may be substituted by another amino acid having 
a similar hydrophilicity score and still result in a protein with similar biological activity, i.e., 
still obtain a biologically functional protein. In making such changes, the substitution of 
amino acids whose hydrophilicity values are within ±2 is preferred, those within ±1 are more
preferred, and those within ±0.5 are most preferred.

As outlined above, amino acid substitutions are therefore based on the relative
similarity of the amino acid side-chain substituents, for example, their hydrophobicity,
hydrophilicity, charge, size, and the like. Exemplary substitutions which take various of the 
foregoing characteristics into consideration are well known to those of skill in the art and 
include: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine and 
asparagine; and valine, leucine, and isoleucine. Changes which are not expected to be 
advantageous may also be used if these resulted in functional fusion proteins. 
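
The exemplary substitution groups named above can be written as a simple lookup. This is a minimal sketch under the assumption that each listed group is treated as a set of mutually interchangeable residues; `EXEMPLARY_GROUPS` and `is_exemplary_substitution` are hypothetical names introduced for illustration.

```python
# Exemplary substitution groups taken from the paragraph above.
EXEMPLARY_GROUPS = [
    {"Arg", "Lys"},
    {"Glu", "Asp"},
    {"Ser", "Thr"},
    {"Gln", "Asn"},
    {"Val", "Leu", "Ile"},
]


def is_exemplary_substitution(original, substitute):
    """Return True if both residues belong to one of the listed groups."""
    if original == substitute:
        return False  # identical residue, not a substitution
    return any(original in g and substitute in g for g in EXEMPLARY_GROUPS)


if __name__ == "__main__":
    print(is_exemplary_substitution("Val", "Ile"))
    print(is_exemplary_substitution("Arg", "Glu"))
```

The lookup does not capture the final sentence of the paragraph: changes outside these groups may still be used if a functional protein results.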

All of the compositions and/or methods disclosed and claimed herein can be made 
and executed without undue experimentation in light of the present disclosure. While the 
compositions and methods of this invention have been described in terms of preferred 
embodiments, it will be apparent to those of skill in the art that variations may be applied to 
the compositions and/or methods and in the steps or in the sequence of steps of the methods 
described herein without departing from the concept, spirit and scope of the invention. 
More specifically, it will be apparent that certain agents which are both chemically and 
physiologically related may be substituted for the agents described herein while the same or 
similar results would be achieved. All such similar substitutes and modifications apparent 
to those skilled in the art are deemed to be within the spirit, scope and concept of the 
invention. 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 



(i) APPLICANT: 

(A) NAME: VOLKER MITTENDORF 
(B) STREET: Institut de Biologie et Physiologie Vegetales 
(C) CITY: Batiment de Biologie 
(D) STATE: Lausanne 
(E) COUNTRY: Switzerland 
(F) POSTAL CODE (ZIP): CH-1015 
(G) TELEPHONE: (41) (21) 692-4222 
(H) TELEFAX: (41) (21) 692-4195 

(A) NAME: YVES POIRIER 
(B) STREET: Institut de Biologie et Physiologie Vegetales 
(C) CITY: Batiment de Biologie 
(D) STATE: Lausanne 
(E) COUNTRY: Switzerland 
(F) POSTAL CODE (ZIP): CH-1015 
(G) TELEPHONE: (41) (21) 692-4222 
(H) TELEFAX: (41) (21) 692-4195 



(ii) TITLE OF INVENTION: BIOSYNTHESIS OF MEDIUM CHAIN LENGTH 
POLYHYDROXYALKANOATES 

(iii) NUMBER OF SEQUENCES: 26 



(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1677 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

ATGAGTCAGA AGAACAATAA CGAGCTTCCC AAGCAAGCCG CGGAAAACAC GCTGAACCTG 60 

AATCCGGTGA TCGGCATCCG GGGCAAGGAC CTGCTCACCT CCGCGCGCAT GGTCCTGCTC 120 

CAGGCGGTGC GCCAGCCGCT GCACAGCGCC AGGCACGTGG CGCATTTCAG CCTGGAGCTG 180 

AAGAACGTCC TGCTCGGCCA GTCGGAGCTA CGCCCAGGCG ATGACGACCG ACGCTTTTCC 240 

GATCCGGCCT GGAGCCAGAA TCCACTGTAC AAGCGCTACA TGCAGACCTA CCTGGCCTGG 300 

CGCAAGGAGC TGCACAGCTG GATCAGCCAC AGCGACCTGT CGCCGCAGGA CATCAGTCGT 360 

GGCCAGTTCG TCATCAACCT GCTGACCGAG GCGATGTCGC CGACCAACAG CCTGAGCAAC 420 

CCGGCGGCGG TCAAGCGCTT CTTCGAGACC GGCGGCAAGA GCCTGCTGGA CGGCCTCGGC 480 

CACCTGGCCA AGGACCTGGT GAACAACGGC GGGATGCCGA GCCAGGTGGA CATGGACGCC 540 

TTCGAGGTGG GCAAGAACCT GGCCACCACC GAGGGCGCCG TGGTGTTCCG CAACGACGTG 600 

CTGGAACTGA TCCAGTACCG GCCGATCACC GAGTCGGTGC ACGAACGCCC GCTGCTCGTG 660 

GTGCCGCCGC AGATCAACAA GTTCTACGTC TTCGACCTGT CGCCGGACAA GAGCCTGGCG 720 

CGCTTCTGCC TGCGCAACGG CGTGCAGACC TTCATCGTCA GTTGGCGCAA CCCGACCAAG 780 

TCGCAGCGCG AATGGGGCCT GACCACCTAT ATCGAGGCGC TCAAGGAGGC CATCGAGGTA 840 

GTCCTGTCGA TCACCGGCAG CAAGGACCTC AACCTCCTCG GCGCCTGCTC CGGCGGGATC 900 

ACCACCGCGA CCCTGGTCGG CCACTACGTG GCCAGCGGCG AGAAGAAGGT CAACGCCTTC 960 

ACCCAACTGG TCAGCGTGCT CGACTTCGAA CTGAATACCC AGGTCGCGCT GTTCGCCGAC 1020 

GAGAAGACTC TGGAGGCCGC CAAGCGTCGT TCCTACCAGT CCGGCGTGCT GGAGGGCAAG 1080 

GACATGGCCA AGGTGTTCGC CTGGATGCGC CCCAACGACC TGATCTGGAA CTACTGGGTC 1140 

AACAACTACC TGCTCGGCAA CCAGCCGCCG GCGTTCGACA TCCTCTACTG GAACAACGAC 1200 

ACCACGCGCC TGCCCGCCGC GCTGCACGGC GAGTTCGTCG AACTGTTCAA GAGCAACCCG 1260 

CTGAACCGCC CCGGCGCCCT GGAGGTCTCC GGCACGCCCA TCGACCTGAA GCAGGTGACT 1320 

TGCGACTTCT ACTGTGTCGC CGGTCTGAAC GACCACATCA CCCCCTGGGA GTCGTGCTAC 1380 

AAGTCGGCCA GGCTGCTGGG TGGCAAGTGC GAGTTCATCC TCTCCAACAG CGGTCACATC 1440 

CAGAGCATCC TCAACCCACC GGGCAACCCC AAGGCACGCT TCATGACCAA TCCGGAACTG 1500 

CCCGCCGAGC CCAAGGCCTG GCTGGAACAG GCCGGCAAGC ACGCCGACTC GTGGTGGTTG 1560 

CACTGGCAGC AATGGCTGGC CGAACGCTCC GGCAAGACCC GCAAGGCGCC CGCCAGCCTG 1620 

GGCAACAAGA CCTATCCGGC CGGCGAAGCC GCGCCCGGAA CCTACGTGCA TGAACGA 1677 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 559 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Ser Gln Lys Asn Asn Asn Glu Leu Pro Lys Gln Ala Ala Glu Asn 
1 5 10 15 

Thr Leu Asn Leu Asn Pro Val Ile Gly Ile Arg Gly Lys Asp Leu Leu 
20 25 30 

Thr Ser Ala Arg Met Val Leu Leu Gln Ala Val Arg Gln Pro Leu His 
35 40 45 

Ser Ala Arg His Val Ala His Phe Ser Leu Glu Leu Lys Asn Val Leu 
50 55 60 

Leu Gly Gln Ser Glu Leu Arg Pro Gly Asp Asp Asp Arg Arg Phe Ser 
65 70 75 80 

Asp Pro Ala Trp Ser Gln Asn Pro Leu Tyr Lys Arg Tyr Met Gln Thr 
85 90 95 

Tyr Leu Ala Trp Arg Lys Glu Leu His Ser Trp Ile Ser His Ser Asp 
100 105 110 

Leu Ser Pro Gln Asp Ile Ser Arg Gly Gln Phe Val Ile Asn Leu Leu 
115 120 125 

Thr Glu Ala Met Ser Pro Thr Asn Ser Leu Ser Asn Pro Ala Ala Val 
130 135 140 

Lys Arg Phe Phe Glu Thr Gly Gly Lys Ser Leu Leu Asp Gly Leu Gly 
145 150 155 160 

His Leu Ala Lys Asp Leu Val Asn Asn Gly Gly Met Pro Ser Gln Val 
165 170 175 

Asp Met Asp Ala Phe Glu Val Gly Lys Asn Leu Ala Thr Thr Glu Gly 
180 185 190 

Ala Val Val Phe Arg Asn Asp Val Leu Glu Leu Ile Gln Tyr Arg Pro 
195 200 205 

Ile Thr Glu Ser Val His Glu Arg Pro Leu Leu Val Val Pro Pro Gln 
210 215 220 

Ile Asn Lys Phe Tyr Val Phe Asp Leu Ser Pro Asp Lys Ser Leu Ala 
225 230 235 240 

Arg Phe Cys Leu Arg Asn Gly Val Gln Thr Phe Ile Val Ser Trp Arg 
245 250 255 

Asn Pro Thr Lys Ser Gln Arg Glu Trp Gly Leu Thr Thr Tyr Ile Glu 
260 265 270 

Ala Leu Lys Glu Ala Ile Glu Val Val Leu Ser Ile Thr Gly Ser Lys 
275 280 285 

Asp Leu Asn Leu Leu Gly Ala Cys Ser Gly Gly Ile Thr Thr Ala Thr 
290 295 300 

Leu Val Gly His Tyr Val Ala Ser Gly Glu Lys Lys Val Asn Ala Phe 
305 310 315 320 

Thr Gln Leu Val Ser Val Leu Asp Phe Glu Leu Asn Thr Gln Val Ala 
325 330 335 

Leu Phe Ala Asp Glu Lys Thr Leu Glu Ala Ala Lys Arg Arg Ser Tyr 
340 345 350 

Gln Ser Gly Val Leu Glu Gly Lys Asp Met Ala Lys Val Phe Ala Trp 
355 360 365 

Met Arg Pro Asn Asp Leu Ile Trp Asn Tyr Trp Val Asn Asn Tyr Leu 
370 375 380 

Leu Gly Asn Gln Pro Pro Ala Phe Asp Ile Leu Tyr Trp Asn Asn Asp 
385 390 395 400 

Thr Thr Arg Leu Pro Ala Ala Leu His Gly Glu Phe Val Glu Leu Phe 
405 410 415 

Lys Ser Asn Pro Leu Asn Arg Pro Gly Ala Leu Glu Val Ser Gly Thr 
420 425 430 

Pro Ile Asp Leu Lys Gln Val Thr Cys Asp Phe Tyr Cys Val Ala Gly 
435 440 445 

Leu Asn Asp His Ile Thr Pro Trp Glu Ser Cys Tyr Lys Ser Ala Arg 
450 455 460 

Leu Leu Gly Gly Lys Cys Glu Phe Ile Leu Ser Asn Ser Gly His Ile 
465 470 475 480 

Gln Ser Ile Leu Asn Pro Pro Gly Asn Pro Lys Ala Arg Phe Met Thr 
485 490 495 

Asn Pro Glu Leu Pro Ala Glu Pro Lys Ala Trp Leu Glu Gln Ala Gly 
500 505 510 

Lys His Ala Asp Ser Trp Trp Leu His Trp Gln Gln Trp Leu Ala Glu 
515 520 525 

Arg Ser Gly Lys Thr Arg Lys Ala Pro Ala Ser Leu Gly Asn Lys Thr 
530 535 540 

Tyr Pro Ala Gly Glu Ala Ala Pro Gly Thr Tyr Val His Glu Arg 
545 550 555 



(2) INFORMATION FOR SEQ ID NO: 3: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1680 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATGCGAGAAA AGCAGGAATC GGGTAGCGTG CCGGTGCCCG CCGAGTTCAT GAGTGCACAG 60 

AGCGCCATCG TCGGCCTGCG CGGCAAGGAC CTGCTGACGA CGGTCCGCAG CCTGGCTGTC 120 

CACGGCCTGC GCCAGCCGCT GCACAGTGCG CGGCACCTGG TCGCCTTCGG AGGCCAGTTG 180 

GGCAAGGTGC TGCTGGGCGA CACCCTGCAC CAGCCGAACC CACAGGACGC CCGCTTCCAG 240 

GATCCATCCT GGCGCCTCAA TCCCTTCTAC CGGCGCACCC TGCAGGCCTA CCTGGCGTGG 300 

CAGAAACAAC TGCTCGCCTG GATCGACGAA AGCAACCTGG ACTGCGACGA TCGCGCCCGC 360 

GCCCGCTTCC TCGTCGCCTT GCTCTCCGAC GCCGTGGCAC CCAGCAACAG CCTGATCAAT 420 

CCACTGGCGT TAAAGGAACT GTTCAATACC GGCGGGATCA GCCTGCTCAA TGGCGTCCGC 480 

CACCTGCTCG AAGACCTGGT GCACAACGGC GGCATGCCCA GCCAGGTGAA CAAGACCGCC 540 

TTCGAGATCG GTCGCAACCT CGCCACCACG CAAGGCGCGG TGGTGTTCCG CAACGAGGTG 600 

CTGGAGCTGA TCCAGTACAA GCCGCTGGGC GAGCGCCAGT ACGCCAAGCC CCTGCTGATC 660 

GTGCCGCCGC AGATCAACAA GTACTACATC TTCGACCTGT CGCCGGAAAA GAGCTTCGTC 720 

CAGTACGCCC TGAAGAACAA CCTGCAGGTC TTCGTCATCA GTTGGCGCAA CCCCGACGCC 780 

CAGCACCGCG AATGGGGCCT GAGCACCTAT GTCGAGGCCC TCGACCAGGC CATCGAGGTC 840 

AGCCGCGAGA TCACCGGCAG CCGCAGCGTG AACCTGGCCG GCGCCTGCGC CGGCGGGCTC 900 

ACCGTAGCCG CCTTGCTCGG CCACCTGCAG GTGCGCCGGC AACTGCGCAA GGTCAGTAGC 960 

GTCACCTACC TGGTCAGCCT GCTCGACAGC CAGATGGAAA GCCCGGCGAT GCTCTTCGCC 1020 

GACGAGCAGA CCCTGGAGAG CAGCAAGCGC CGCTCCTACC AGCATGGCGT GCTGGACGGG 1080 

CGCGACATGG CCAAGGTGTT CGCCTGGATG CGCCCCAACG ACCTGATCTG GAACTACTGG 1140 

GTCAACAACT ACCTGCTCGG CAGGCAGCCG CCGGCGTTCG ACATCCTCTA CTGGAACAAC 1200 

GACAACACGC GGCTGCCCGC GGCGTTCCAC GGCGAACTGC TCGACCTGTT CAAGCACAAC 1260 

CCGCTGACCC GCCCGGGCGC GCTGGAGGTC AGCGGGACCG CGGTGGACCT GGGCAAGGTG 1320 



GCGATCGACA GCTTCCACGT CGCCGGCATC ACCGACCACA TCACGCCCTG GGACGCGGTG 1380 



WO 99/35278 PCT/US98/00083 

TATCGCTCGG CCCTCCTGCT GGGCGGCCAG CGCCGCTTCA TCCTGTCCAA CAGCGGGCAC 1440 

ATCCAGAGCA TCCTCAACCC TCCCGGAAAC CCCAAGGCCT GCTACTTCGA GAACGACAAG 1500 

CTGAGCAGCG ATCCACGCGC CTGGTACTAC GACGCCAAGC GCGAAGAGGG CAGCTGGTGG 1560 

CCGGTCTGGC TGGGCTGGCT GCAGGAGCGC TCGGGCGAGC TGGGCAACCC TGACTTCAAC 1620 

CTTGGCAGCG CCGCGCATCC GCCCCTCGAA GCGGCCCCGG GCACCTACGT GCATATACGC 1680 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 560 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Arg Glu Lys Gln Glu Ser Gly Ser Val Pro Val Pro Ala Glu Phe 
1 5 10 15 

Met Ser Ala Gln Ser Ala Ile Val Gly Leu Arg Gly Lys Asp Leu Leu 
20 25 30 

Thr Thr Val Arg Ser Leu Ala Val His Gly Leu Arg Gln Pro Leu His 
35 40 45 

Ser Ala Arg His Leu Val Ala Phe Gly Gly Gln Leu Gly Lys Val Leu 
50 55 60 

Leu Gly Asp Thr Leu His Gln Pro Asn Pro Gln Asp Ala Arg Phe Gln 
65 70 75 80 

Asp Pro Ser Trp Arg Leu Asn Pro Phe Tyr Arg Arg Thr Leu Gln Ala 
85 90 95 

Tyr Leu Ala Trp Gln Lys Gln Leu Leu Ala Trp Ile Asp Glu Ser Asn 
100 105 110 

Leu Asp Cys Asp Asp Arg Ala Arg Ala Arg Phe Leu Val Ala Leu Leu 
115 120 125 

Ser Asp Ala Val Ala Pro Ser Asn Ser Leu Ile Asn Pro Leu Ala Leu 
130 135 140 

Lys Glu Leu Phe Asn Thr Gly Gly Ile Ser Leu Leu Asn Gly Val Arg 
145 150 155 160 

His Leu Leu Glu Asp Leu Val His Asn Gly Gly Met Pro Ser Gln Val 
165 170 175 

Asn Lys Thr Ala Phe Glu Ile Gly Arg Asn Leu Ala Thr Thr Gln Gly 
180 185 190 

Ala Val Val Phe Arg Asn Glu Val Leu Glu Leu Ile Gln Tyr Lys Pro 
195 200 205 

Leu Gly Glu Arg Gln Tyr Ala Lys Pro Leu Leu Ile Val Pro Pro Gln 
210 215 220 

Ile Asn Lys Tyr Tyr Ile Phe Asp Leu Ser Pro Glu Lys Ser Phe Val 
225 230 235 240 

Gln Tyr Ala Leu Lys Asn Asn Leu Gln Val Phe Val Ile Ser Trp Arg 
245 250 255 

Asn Pro Asp Ala Gln His Arg Glu Trp Gly Leu Ser Thr Tyr Val Glu 
260 265 270 

Ala Leu Asp Gln Ala Ile Glu Val Ser Arg Glu Ile Thr Gly Ser Arg 
275 280 285 

Ser Val Asn Leu Ala Gly Ala Cys Ala Gly Gly Leu Thr Val Ala Ala 
290 295 300 

Leu Leu Gly His Leu Gln Val Arg Arg Gln Leu Arg Lys Val Ser Ser 
305 310 315 320 

Val Thr Tyr Leu Val Ser Leu Leu Asp Ser Gln Met Glu Ser Pro Ala 
325 330 335 

Met Leu Phe Ala Asp Glu Gln Thr Leu Glu Ser Ser Lys Arg Arg Ser 
340 345 350 

Tyr Gln His Gly Val Leu Asp Gly Arg Asp Met Ala Lys Val Phe Ala 
355 360 365 

Trp Met Arg Pro Asn Asp Leu Ile Trp Asn Tyr Trp Val Asn Asn Tyr 
370 375 380 

Leu Leu Gly Arg Gln Pro Pro Ala Phe Asp Ile Leu Tyr Trp Asn Asn 
385 390 395 400 

Asp Asn Thr Arg Leu Pro Ala Ala Phe His Gly Glu Leu Leu Asp Leu 
405 410 415 

Phe Lys His Asn Pro Leu Thr Arg Pro Gly Ala Leu Glu Val Ser Gly 
420 425 430 

Thr Ala Val Asp Leu Gly Lys Val Ala Ile Asp Ser Phe His Val Ala 
435 440 445 

Gly Ile Thr Asp His Ile Thr Pro Trp Asp Ala Val Tyr Arg Ser Ala 
450 455 460 

Leu Leu Leu Gly Gly Gln Arg Arg Phe Ile Leu Ser Asn Ser Gly His 
465 470 475 480 

Ile Gln Ser Ile Leu Asn Pro Pro Gly Asn Pro Lys Ala Cys Tyr Phe 
485 490 495 

Glu Asn Asp Lys Leu Ser Ser Asp Pro Arg Ala Trp Tyr Tyr Asp Ala 
500 505 510 

Lys Arg Glu Glu Gly Ser Trp Trp Pro Val Trp Leu Gly Trp Leu Gln 
515 520 525 

Glu Arg Ser Gly Glu Leu Gly Asn Pro Asp Phe Asn Leu Gly Ser Ala 
530 535 540 

Ala His Pro Pro Leu Glu Ala Ala Pro Gly Thr Tyr Val His Ile Arg 
545 550 555 560 



(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GGAGAATTCC CGATGAGCCA GAAGAACAA 29 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

CTGGAAGCTT TTGATCGTTC ATGCACGTA 29 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GTGGAATTCA TGCGTGAAAA GCAGGAATC 29 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GGCCAAGCTT TTGAGCGTAT ATGCACGTA 29 
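
Each of the four 29-base oligonucleotides SEQ ID NOS: 5 to 8 contains a six-base sequence that matches a common restriction enzyme recognition site (GAATTC for EcoRI, AAGCTT for HindIII). The Python sketch below simply locates those motifs in the sequences as printed above. It is illustrative only: the function name `find_sites` is an assumption, and whether these sites were actually used for cloning is a matter for the examples of the specification, not something this listing states.

```python
# Recognition sequences of two common restriction enzymes.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}

# Oligonucleotides as printed in the listing (spaces removed).
OLIGOS = {
    "SEQ ID NO: 5": "GGAGAATTCCCGATGAGCCAGAAGAACAA",
    "SEQ ID NO: 6": "CTGGAAGCTTTTGATCGTTCATGCACGTA",
    "SEQ ID NO: 7": "GTGGAATTCATGCGTGAAAAGCAGGAATC",
    "SEQ ID NO: 8": "GGCCAAGCTTTTGAGCGTATATGCACGTA",
}


def find_sites(sequence, sites=SITES):
    """Map enzyme name to the 0-based index of its first occurrence."""
    sequence = sequence.upper()
    return {
        name: sequence.find(motif)
        for name, motif in sites.items()
        if motif in sequence
    }


if __name__ == "__main__":
    for name, seq in OLIGOS.items():
        print(name, len(seq), find_sites(seq))
```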
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1731 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATGGCTGCAT CTTTCTCTGT CCCCTCTATG ATCATGGAAG AGGAAGGGAG ATTTGAGGCG 60 

GAAGTTGCGG AAGTGCAGAC TTGGTGGAGC TCAGAGAGGT TCAAGCTAAC AAGGCGTCCT 120 

TACACGGCCC GTGACGTGGT GGCTCTACGT GGTCATCTCA AGCAAGGTTA TGCTTCGAAC 180 

GAGATGGCTA AGAAGCTGTG GAGAACGCTC AAGAGTCACC AAGTCAACGG CACGGCGTCT 240 

CGCACGTTTG GTGCCTTGGA CCCTGTTCAG GTGACAATGA TGGCTAAACA TTTAGACACC 300 

ATTTATGTCT CTGGTTGGCA GTGCTCGTCT ACTCACACCT CCACTAACGA GCCTGGTCCG 360 

GATCTTGCTG ACTATCCATA CGATACCGTT CCTAACAAGG TCGAACATCT CTTCTTCGCT 420 

CAGCAGTACC ATGACAGAAA ACAGAGGGAG GCGAGAATGA GCATGAGCAG AGAAGAAAGA 480 

GCAAAAACTC CGTTTGTGGA CTACTTGAAG CCCATCATCG CCGACGGAGG AACCGGCTTC 540 

GGCGGTACCA CTGCCACCGT AAAACTCTGC AAACTCTTCG TTGAAAGAGG AGCCGCTGGG 600 

GTCCACATCG AGGACCAGTC CTCCGTCACC AAGAAGTGTG GCCACATGGC CGGAAAAGTC 660 

CTCGTGGCAG TCAGTGAACA CATCAACCGC CTTGTTGCGG CTCGGCTCCA GTTCGACGTG 720 

ATGGGCACAG AGACCGTCCT GGTCGCTAGA ACGGACGCGG TCGCGCCCAC TCTGATCCAA 780 

TCGAACATTG ACTCAAGGGA CCACCAGTTC ATCCTCGGTG TCACTAACCC AAACCTTAGA 840 

GGCAAGAGTT TGTCCTCGCT TCTGGCCGAG GGAATGGCTG TAGGCAATAA TGGTCCAGCG 900 

TTGCAAGCGA TTGAGGATCA ATGGCTTAGC TCAGCTCGTC TCATGACTTT CTCGGACGCT 960 

GTCGTGGAGG CTCTCAAGCG CATGAACCTA AGTGAGAATG AGAAGAGCCG GAGAGTGACC 1020 

GAGTGGCTAA TCCATGCAAG GTACGAGAAC TGCCTTTCAA ACGAGCAAGG CCGAGAATTA 1080 

GCAGCAAAAC TCGGTGTGAC TGATCTTTTC TGGGACTGGG ACTTGCCCAG AACCAGAGAA 1140 

GGATTCTACC GGTTCCAAGG CTCGGTCACA GCAGCCGTGG TCCGTGGCTG GGCCTTTGCA 1200 

CAGATAGCTG ATCTCATCTG GATGGAAACC GCAAGCCCTG ACCTCAACGA ATGCACCCAA 1260 

TTCGCAGAAG GAGTCAAGTC CAAGACACCA GAGGTAATGC TCGCCTACAA CCTCTCCCCA 1320 

TCCTTCAACT GGGACGCTTC TGGTATGACG GATCAGCAGA TGATGGAGTT CATTCCACGA 1380 

ATCGCCAGGC TCGGTTATTG CTGGCAGTTT ATAACCCTTG CGGGTTTCCA TGCGGATGCT 1440 

CTTGTGGTCG ATACGTTTGC AAAGGATTAC GCGAGGAGAG GGATGCTGGC TTATGTCGAG 1500 

AGGATACAGA GAGAAGAGAG GAGCAATGGG GTTGACACAT TGGCTCATCA GAAATGGTCA 1560 

GGTGCTAATT ACTATGATCG TTATCTTAAG ACCGTCCAAG GTGGAATCTC CTCCACTGCA 1620 

GCCATGGGCA AAGGTGTTAC CGAGGAACAA TTCAAAGAGA CCTGGACGAG GCCGGGAGCT 1680 

GCTGGAATGG GCGAAGGGAC TAGCCTTGTG GTGGCCAAGT CCAGAATGTA A 1731 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Ala Ala Ser Phe Ser Val Pro Ser Met Ile Met Glu Glu Glu Gly 
1 5 10 15 

Arg Phe Glu Ala Glu Val Ala Glu Val Gln Thr Trp Trp Ser Ser Glu 
20 25 30 

Arg Phe Lys Leu Thr Arg Arg Pro Tyr Thr Ala Arg Asp Val Val Ala 
35 40 45 

Leu Arg Gly His Leu Lys Gln Gly Tyr Ala Ser Asn Glu Met Ala Lys 
50 55 60 

Lys Leu Trp Arg Thr Leu Lys Ser His Gln Val Asn Gly Thr Ala Ser 
65 70 75 80 

Arg Thr Phe Gly Ala Leu Asp Pro Val Gln Val Thr Met Met Ala Lys 
85 90 95 

His Leu Asp Thr Ile Tyr Val Ser Gly Trp Gln Cys Ser Ser Thr His 
100 105 110 

Thr Ser Thr Asn Glu Pro Gly Pro Asp Leu Ala Asp Tyr Pro Tyr Asp 
115 120 125 

Thr Val Pro Asn Lys Val Glu His Leu Phe Phe Ala Gln Gln Tyr His 
130 135 140 

Asp Arg Lys Gln Arg Glu Ala Arg Met Ser Met Ser Arg Glu Glu Arg 
145 150 155 160 

Ala Lys Thr Pro Phe Val Asp Tyr Leu Lys Pro Ile Ile Ala Asp Gly 
165 170 175 

Gly Thr Gly Phe Gly Gly Thr Thr Ala Thr Val Lys Leu Cys Lys Leu 
180 185 190 

Phe Val Glu Arg Gly Ala Ala Gly Val His Ile Glu Asp Gln Ser Ser 
195 200 205 

Val Thr Lys Lys Cys Gly His Met Ala Gly Lys Val Leu Val Ala Val 
210 215 220 

Ser Glu His Ile Asn Arg Leu Val Ala Ala Arg Leu Gln Phe Asp Val 
225 230 235 240 

Met Gly Thr Glu Thr Val Leu Val Ala Arg Thr Asp Ala Val Ala Pro 
245 250 255 

Thr Leu Ile Gln Ser Asn Ile Asp Ser Arg Asp His Gln Phe Ile Leu 
260 265 270 

Gly Val Thr Asn Pro Asn Leu Arg Gly Lys Ser Leu Ser Ser Leu Leu 
275 280 285 

Ala Glu Gly Met Ala Val Gly Asn Asn Gly Pro Ala Leu Gln Ala Ile 
290 295 300 

Glu Asp Gln Trp Leu Ser Ser Ala Arg Leu Met Thr Phe Ser Asp Ala 
305 310 315 320 

Val Val Glu Ala Leu Lys Arg Met Asn Leu Ser Glu Asn Glu Lys Ser 
325 330 335 

Arg Arg Val Thr Glu Trp Leu Ile His Ala Arg Tyr Glu Asn Cys Leu 
340 345 350 

Ser Asn Glu Gln Gly Arg Glu Leu Ala Ala Lys Leu Gly Val Thr Asp 
355 360 365 

Leu Phe Trp Asp Trp Asp Leu Pro Arg Thr Arg Glu Gly Phe Tyr Arg 
370 375 380 

Phe Gln Gly Ser Val Thr Ala Ala Val Val Arg Gly Trp Ala Phe Ala 
385 390 395 400 

Gln Ile Ala Asp Leu Ile Trp Met Glu Thr Ala Ser Pro Asp Leu Asn 
405 410 415 

Glu Cys Thr Gln Phe Ala Glu Gly Val Lys Ser Lys Thr Pro Glu Val 
420 425 430 

Met Leu Ala Tyr Asn Leu Ser Pro Ser Phe Asn Trp Asp Ala Ser Gly 
435 440 445 

Met Thr Asp Gln Gln Met Met Glu Phe Ile Pro Arg Ile Ala Arg Leu 
450 455 460 

Gly Tyr Cys Trp Gln Phe Ile Thr Leu Ala Gly Phe His Ala Asp Ala 
465 470 475 480 

Leu Val Val Asp Thr Phe Ala Lys Asp Tyr Ala Arg Arg Gly Met Leu 
485 490 495 

Ala Tyr Val Glu Arg Ile Gln Arg Glu Glu Arg Ser Asn Gly Val Asp 
500 505 510 

Thr Leu Ala His Gln Lys Trp Ser Gly Ala Asn Tyr Tyr Asp Arg Tyr 
515 520 525 

Leu Lys Thr Val Gln Gly Gly Ile Ser Ser Thr Ala Ala Met Gly Lys 
530 535 540 

Gly Val Thr Glu Glu Gln Phe Lys Glu Thr Trp Thr Arg Pro Gly Ala 
545 550 555 560 

Ala Gly Met Gly Glu Gly Thr Ser Leu Val Val Ala Lys Ser Arg Met 
565 570 575 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ACTGAAGCTT TGGGCAAAGG TGTTAC 26 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GTGGTCTAGA AGTTTTTCTG CGAAGATG 28 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 102 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GGCAAAGGTG TTACCGAGGA ACAATTCAAA GAGACCTGGA CGAGGCCGGG AGCTGCTGGA 60 

ATGGGCGAAG GGACTAGCCT TGTGGTGGCC AAGTCCAGAA TG 102 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 34 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Gly Lys Gly Val Thr Glu Glu Gln Phe Lys Glu Thr Trp Thr Arg Pro 
1 5 10 15 

Gly Ala Ala Gly Met Gly Glu Gly Thr Ser Leu Val Val Ala Lys Ser 
20 25 30 

Arg Met 
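
The relationship between a nucleotide entry and its amino acid entry can be checked by translation. The Python sketch below translates SEQ ID NO: 13 (102 bases) with the standard genetic code and compares the result with SEQ ID NO: 14 (34 residues) written in one-letter code. It is an illustration only: the names `CODON_TABLE`, `translate`, `SEQ_ID_13` and `SEQ_ID_14` are introduced here, and the one-letter string is a transcription of the listing above rather than text from the specification.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# SEQ ID NO: 13 as printed in the listing (spaces removed).
SEQ_ID_13 = (
    "GGCAAAGGTGTTACCGAGGAACAATTCAAAGAGACCTGGACGAGGCCGGGAGCTGCTGGA"
    "ATGGGCGAAGGGACTAGCCTTGTGGTGGCCAAGTCCAGAATG"
)

# SEQ ID NO: 14 transcribed into one-letter code.
SEQ_ID_14 = "GKGVTEEQFKETWTRPGAAGMGEGTSLVVAKSRM"


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    protein = translate(SEQ_ID_13)
    print(protein)
    print("matches SEQ ID NO: 14:", protein == SEQ_ID_14)
```

The same function can be applied to the longer nucleotide entries to spot transcription or scanning errors in a listing, since a single wrong base usually changes one residue and an inserted or deleted base shifts the frame.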

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1677 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

ATGAGCCAGA AGAACAATAA CGAGCTTCCC AAGCAAGCCG CGGAAAACAC GCTGAACCTG 60 

AATCCGGTGA TCGGCATCCG GGGCAAGGAC CTGCTCACCT CCGCGCGCAT GGTCCTGCTC 120 

CAGGCGGTGC GCCAGCCGCT GCACAGCGCC AGGCACGTGG CGCATTTCAG CCTGGAGCTG 180 

AAGAACGTCC TGCTCGGCCA GTCGGAGCTA CGCCCAGGCG ATGACGACCG ACGCTTTTCC 240 

GATCCGGCCT GGAGCCAGAA TCCACTGTAC AAGCGCTACA TGCAGACCTA CCTGGCCTGG 300 

CGCAAGGAGC TGCACAGCTG GATCAGCCAC AGCGACCTGT CGCCGCAGGA CATCAGTCGT 360 

GGCCAGTTCG TCATCAACCT GCTGACCGAG GCGATGTCGC CGACCAACAG CCTGAGCAAC 420 

CCGGCGGCGG TCAAGCGCTT CTTCGAGACC GGCGGCAAGA GCCTGCTGGA CGGCCTCGGC 480 

CACCTGGCCA AGGACCTGGT GAACAACGGC GGGATGCCGA GCCAGGTGGA CATGGACGCC 540 

TTCGAGGTGG GCAAGAACCT GGCCACCACC GAGGGCGCCG TGGTGTTCCG CAACGACGTG 600 

CTGGAACTGA TCCAGTACCG GCCGATCACC GAGTCGGTGC ACGAACGCCC GCTGCTGGTG 660 

GTGCCGCCGC AGATCAACAA GTTCTACGTC TTCGACCTGT CGCCGGACAA GAGCCTGGCG 720 

CGCTTCTGCC TGCGCAACGG CGTGCAGACC TTCATCGTCA GTTGGCGCAA CCCGACCAAG 780 

TCGCAGCGCG AATGGGGCCT GACCACCTAT ATCGAGGCGC TCAAGGAGGC CATCGAGGTA 840 

GTCCTGTCGA TCACCGGCAG CAAGGACCTC AACCTCCTCG GCGCCTGCTC CGGCGGGATC 900 

ACCACCGCGA CCCTGGTCGG CCACTACGTG GCCAGCGGCG AGAAGAAGGT CAACGCCTTC 960 

ACCCAACTGG TCAGCGTGCT CGACTTCGAA CTGAATACCC AGGTCGCGCT GTTCGCCGAC 1020 

GAGAAGACTC TGGAGGCCGC CAAGCGTCGT TCCTACCAGT CCGGCGTGCT GGAGGGCAAG 1080 

GACATGGCCA AGGTGTTCGC CTGGATGCGC CCCAACGACC TGATCTGGAA CTACTGGGTC 1140 

AACAACTACC TGCTCGGCAA CCAGCCGCCG GCGTTCGACA TCCTCTACTG GAACAACGAC 1200 

ACCACGCGCC TGCCCGCCGC GCTGCACGGC GAGTTCGTCG AACTGTTCAA GAGCAACCCG 1260 

CTGAACCGCC CCGGCGCCCT GGAGGTCTCC GGCACGCCCA TCGACCTGAA GCAGGTGACT 1320 

TGCGACTTCT ACTGTGTCGC CGGTCTGAAC GACCACATCA CCCCCTGGGA GTCGTGCTAC 1380 

AAGTCGGCCA GGCTGCTGGG TGGCAAGTGC GAGTTCATCC TCTCCAACAG CGGTCACATC 1440 

CAGAGCATCC TCAACCCACC GGGCAACCCC AAGGCACGCT TCATGACCAA TCCGGAACTG 1500 

CCCGCCGAGC CCAAGGCCTG GCTGGAACAG GCCGGCAAGC ACGCCGACTC GTGGTGGTTG 1560 

CACTGGCAGC AATGGCTGGC CGAACGCTCC GGCAAGACCC GCAAGGCGCC CGCCAGCCTG 1620 

GGCAACAAGA CCTATCCGGC CGGCGAAGCC GCGCCCGGAA CCTACGTGCA TGAACGA 1677 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1680 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

ATGCGTGAAA AGCAGGAATC GGGTAGCGTG CCGGTGCCCG CCGAGTTCAT GAGTGCACAG 60 

AGCGCCATCG TCGGCCTGCG CGGCAAGGAC CTGCTGACGA CGGTCCGCAG CCTGGCTGTC 120 

CACGGCCTGC GCCAGCCGCT GCACAGTGCG CGGCACCTGG TCGCCTTCGG AGGCCAGTTG 180 

GGCAAGGTGC TGCTGGGCGA CACCCTGCAC CAGCCGAACC CACAGGACGC CCGCTTCCAG 240 

GATCCATCCT GGCGCCTCAA TCCCTTCTAC CGGCGCACCC TGCAGGCCTA CCTGGCGTGG 300 

CAGAAACAAC TGCTCGCCTG GATCGACGAA AGCAACCTGG ACTGCGACGA TCGCGCCCGC 360 

GCCCGCTTCC TCGTCGCCTT GCTCTCCGAC GCCGTGGCAC CCAGCAACAG CCTGATCAAT 420 

CCACTGGCGT TAAAGGAACT GTTCAATACC GGCGGGATCA GCCTGCTCAA TGGCGTCCGC 480 

CACCTGCTCG AAGACCTGGT GCACAACGGC GGCATGCCCA GCCAGGTGAA CAAGACCGCC 540 

TTCGAGATCG GTCGCAACCT CGCCACCACG CAAGGCGCGG TGGTGTTCCG CAACGAGGTG 600 

CTGGAGCTGA TCCAGTACAA GCCGCTGGGC GAGCGCCAGT ACGCCAAGCC CCTGCTGATC 660 

GTGCCGCCGC AGATCAACAA GTACTACATC TTCGACCTGT CGCCGGAAAA GAGCTTCGTC 720 

CAGTACGCCC TGAAGAACAA CCTGCAGGTC TTCGTCATCA GTTGGCGCAA CCCCGACGCC 780 

CAGCACCGCG AATGGGGCCT GAGCACCTAT GTCGAGGCCC TCGACCAGGC CATCGAGGTC 840 

AGCCGCGAGA TCACCGGCAG CCGCAGCGTG AACCTGGCCG GCGCCTGCGC CGGCGGGCTC 900 

ACCGTAGCCG CCTTGCTCGG CCACCTGCAG GTGCGCCGGC AACTGCGCAA GGTCAGTAGC 960 

GTCACCTACC TGGTCAGCCT GCTCGACAGC CAGATGGAAA GCCCGGCGAT GCTCTTCGCC 1020 

GACGAGCAGA CCCTGGAGAG CAGCAAGCGC CGCTCCTACC AGCATGGCGT GCTGGACGGG 1080 

CGCGACATGG CCAAGGTGTT CGCCTGGATG CGCCCCAACG ACCTGATCTG GAACTACTGG 1140 

GTCAACAACT ACCTGCTCGG CAGGCAGCCG CCGGCGTTCG ACATCCTCTA CTGGAACAAC 1200 

GACAACACGC GGCTGCCCGC GGCGTTCCAC GGCGAACTGC TCGACCTGTT CAAGCACAAC 1260 

CCGCTGACCC GCCCGGGCGC GCTGGAGGTC AGCGGGACCG CGGTGGACCT GGGCAAGGTG 1320 

GCGATCGACA GCTTCCACGT CGCCGGCATC ACCGACCACA TCACGCCCTG GGACGCGGTG 1380 

TATCGCTCGG CCCTCCTGCT GGGCGGCCAG CGCCGCTTCA TCCTGTCCAA CAGCGGGCAC 1440 

ATCCAGAGCA TCCTCAACCC TCCCGGAAAC CCCAAGGCCT GCTACTTCGA GAACGACAAG 1500 

CTGAGCAGCG ATCCACGCGC CTGGTACTAC GACGCCAAGC GCGAAGAGGG CAGCTGGTGG 1560 

CCGGTCTGGC TGGGCTGGCT GCAGGAGCGC TCGGGCGAGC TGGGCAACCC TGACTTCAAC 1620 

CTTGGCAGCG CCGCGCATCC GCCCCTCGAA GCGGCCCCGG GCACCTACGT GCATATACGC 1680 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1791 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

ATGAGCCAGA AGAACAATAA CGAGCTTCCC AAGCAAGCCG CGGAAAACAC GCTGAACCTG 60 

AATCCGGTGA TCGGCATCCG GGGCAAGGAC CTGCTCACCT CCGCGCGCAT GGTCCTGCTC 120 

CAGGCGGTGC GCCAGCCGCT GCACAGCGCC AGGCACGTGG CGCATTTCAG CCTGGAGCTG 180 

AAGAACGTCC TGCTCGGCCA GTCGGAGCTA CGCCCAGGCG ATGACGACCG ACGCTTTTCC 240 

GATCCGGCCT GGAGCCAGAA TCCACTGTAC AAGCGCTACA TGCAGACCTA CCTGGCCTGG 300 

CGCAAGGAGC TGCACAGCTG GATCAGCCAC AGCGACCTGT CGCCGCAGGA CATCAGTCGT 360 

GGCCAGTTCG TCATCAACCT GCTGACCGAG GCGATGTCGC CGACCAACAG CCTGAGCAAC 420 

CCGGCGGCGG TCAAGCGCTT CTTCGAGACC GGCGGCAAGA GCCTGCTGGA CGGCCTCGGC 480 

CACCTGGCCA AGGACCTGGT GAACAACGGC GGGATGCCGA GCCAGGTGGA CATGGACGCC 540 

TTCGAGGTGG GCAAGAACCT GGCCACCACC GAGGGCGCCG TGGTGTTCCG CAACGACGTG 600 

CTGGAACTGA TCCAGTACCG GCCGATCACC GAGTCGGTGC ACGAACGCCC GCTGCTGGTG 660 

GTGCCGCCGC AGATCAACAA GTTCTACGTC TTCGACCTGT CGCCGGACAA GAGCCTGGCG 720 

CGCTTCTGCC TGCGCAACGG CGTGCAGACC TTCATCGTCA GTTGGCGCAA CCCGACCAAG 780 

TCGCAGCGCG AATGGGGCCT GACCACCTAT ATCGAGGCGC TCAAGGAGGC CATCGAGGTA 840 

GTCCTGTCGA TCACCGGCAG CAAGGACCTC AACCTCCTCG GCGCCTGCTC CGGCGGGATC 900 

ACCACCGCGA CCCTGGTCGG CCACTACGTG GCCAGCGGCG AGAAGAAGGT CAACGCCTTC 960 

ACCCAACTGG TCAGCGTGCT CGACTTCGAA CTGAATACCC AGGTCGCGCT GTTCGCCGAC 1020 

GAGAAGACTC TGGAGGCCGC CAAGCGTCGT TCCTACCAGT CCGGCGTGCT GGAGGGCAAG 1080 

GACATGGCCA AGGTGTTCGC CTGGATGCGC CCCAACGACC TGATCTGGAA CTACTGGGTC 1140 

AACAACTACC TGCTCGGCAA CCAGCCGCCG GCGTTCGACA TCCTCTACTG GAACAACGAC 1200 

ACCACGCGCC TGCCCGCCGC GCTGCACGGC GAGTTCGTCG AACTGTTCAA GAGCAACCCG 1260 

CTGAACCGCC CCGGCGCCCT GGAGGTCTCC GGCACGCCCA TCGACCTGAA GCAGGTGACT 1320 

TGCGACTTCT ACTGTGTCGC CGGTCTGAAC GACCACATCA CCCCCTGGGA GTCGTGCTAC 1380 

AAGTCGGCCA GGCTGCTGGG TGGCAAGTGC GAGTTCATCC TCTCCAACAG CGGTCACATC 1440 

CAGAGCATCC TCAACCCACC GGGCAACCCC AAGGCACGCT TCATGACCAA TCCGGAACTG 1500 

CCCGCCGAGC CCAAGGCCTG GCTGGAACAG GCCGGCAAGC ACGCCGACTC GTGGTGGTTG 1560 

CACTGGCAGC AATGGCTGGC CGAACGCTCC GGCAAGACCC GCAAGGCGCC CGCCAGCCTG 1620 

GGCAACAAGA CCTATCCGGC CGGCGAAGCC GCGCCCGGAA CCTACGTGCA TGAACGATCA 1680 

AAAGCTTTGG GCAAAGGTGT TACCGAGGAA CAATTCAAAG AGACCTGGAC GAGGCCGGGA 1740 

GCTGCTGGAA TGGGCGAAGG GACTAGCCTT GTGGTGGCCA AGTCCAGAAT G 1791 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 597 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Ser Gln Lys Asn Asn Asn Glu Leu Pro Lys Gln Ala Ala Glu Asn 
1 5 10 15 

Thr Leu Asn Leu Asn Pro Val Ile Gly Ile Arg Gly Lys Asp Leu Leu 
20 25 30 

Thr Ser Ala Arg Met Val Leu Leu Gln Ala Val Arg Gln Pro Leu His 
35 40 45 

Ser Ala Arg His Val Ala His Phe Ser Leu Glu Leu Lys Asn Val Leu 
50 55 60 

Leu Gly Gln Ser Glu Leu Arg Pro Gly Asp Asp Asp Arg Arg Phe Ser 
65 70 75 80 

Asp Pro Ala Trp Ser Gln Asn Pro Leu Tyr Lys Arg Tyr Met Gln Thr 
85 90 95 

Tyr Leu Ala Trp Arg Lys Glu Leu His Ser Trp Ile Ser His Ser Asp 
100 105 110 

Leu Ser Pro Gln Asp Ile Ser Arg Gly Gln Phe Val Ile Asn Leu Leu 
115 120 125 

Thr Glu Ala Met Ser Pro Thr Asn Ser Leu Ser Asn Pro Ala Ala Val 
130 135 140 

Lys Arg Phe Phe Glu Thr Gly Gly Lys Ser Leu Leu Asp Gly Leu Gly 
145 150 155 160 

His Leu Ala Lys Asp Leu Val Asn Asn Gly Gly Met Pro Ser Gln Val 
165 170 175 

Asp Met Asp Ala Phe Glu Val Gly Lys Asn Leu Ala Thr Thr Glu Gly 
180 185 190 

Ala Val Val Phe Arg Asn Asp Val Leu Glu Leu Ile Gln Tyr Arg Pro 
195 200 205 

Ile Thr Glu Ser Val His Glu Arg Pro Leu Leu Val Val Pro Pro Gln 
210 215 220 

Ile Asn Lys Phe Tyr Val Phe Asp Leu Ser Pro Asp Lys Ser Leu Ala 
225 230 235 240 

Arg Phe Cys Leu Arg Asn Gly Val Gln Thr Phe Ile Val Ser Trp Arg 
245 250 255 

Asn Pro Thr Lys Ser Gln Arg Glu Trp Gly Leu Thr Thr Tyr Ile Glu 
260 265 270 

Ala Leu Lys Glu Ala Ile Glu Val Val Leu Ser Ile Thr Gly Ser Lys 
275 280 285 

Asp Leu Asn Leu Leu Gly Ala Cys Ser Gly Gly Ile Thr Thr Ala Thr 
290 295 300 

Leu Val Gly His Tyr Val Ala Ser Gly Glu Lys Lys Val Asn Ala Phe 
305 310 315 320 

Thr Gln Leu Val Ser Val Leu Asp Phe Glu Leu Asn Thr Gln Val Ala 
325 330 335 

Leu Phe Ala Asp Glu Lys Thr Leu Glu Ala Ala Lys Arg Arg Ser Tyr 
340 345 350 

Gln Ser Gly Val Leu Glu Gly Lys Asp Met Ala Lys Val Phe Ala Trp 
355 360 365 

Met Arg Pro Asn Asp Leu Ile Trp Asn Tyr Trp Val Asn Asn Tyr Leu 
370 375 380 

Leu Gly Asn Gln Pro Pro Ala Phe Asp Ile Leu Tyr Trp Asn Asn Asp 
385 390 395 400 

Thr Thr Arg Leu Pro Ala Ala Leu His Gly Glu Phe Val Glu Leu Phe 
405 410 415 

Lys Ser Asn Pro Leu Asn Arg Pro Gly Ala Leu Glu Val Ser Gly Thr 
420 425 430 

Pro Ile Asp Leu Lys Gln Val Thr Cys Asp Phe Tyr Cys Val Ala Gly 
435 440 445 

Leu Asn Asp His Ile Thr Pro Trp Glu Ser Cys Tyr Lys Ser Ala Arg 
450 455 460 

Leu Leu Gly Gly Lys Cys Glu Phe Ile Leu Ser Asn Ser Gly His Ile 
465 470 475 480 

Gln Ser Ile Leu Asn Pro Pro Gly Asn Pro Lys Ala Arg Phe Met Thr 
485 490 495 

Asn Pro Glu Leu Pro Ala Glu Pro Lys Ala Trp Leu Glu Gln Ala Gly 
500 505 510 

Lys His Ala Asp Ser Trp Trp Leu His Trp Gln Gln Trp Leu Ala Glu 
515 520 525 

Arg Ser Gly Lys Thr Arg Lys Ala Pro Ala Ser Leu Gly Asn Lys Thr 
530 535 540 

Tyr Pro Ala Gly Glu Ala Ala Pro Gly Thr Tyr Val His Glu Arg Ser 
545 550 555 560 

Lys Ala Leu Gly Lys Gly Val Thr Glu Glu Gln Phe Lys Glu Thr Trp 
565 570 575 

Thr Arg Pro Gly Ala Ala Gly Met Gly Glu Gly Thr Ser Leu Val Val 
580 585 590 

Ala Lys Ser Arg Met 
595 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1794 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

ATGCGTGAAA AGCAGGAATC GGGTAGCGTG CCGGTGCCCG CCGAGTTCAT GAGTGCACAG 60 

AGCGCCATCG TCGGCCTGCG CGGCAAGGAC CTGCTGACGA CGGTCCGCAG CCTGGCTGTC 120 

CACGGCCTGC GCCAGCCGCT GCACAGTGCG CGGCACCTGG TCGCCTTCGG AGGCCAGTTG 180 

GGCAAGGTGC TGCTGGGCGA CACCCTGCAC CAGCCGAACC CACAGGACGC CCGCTTCCAG 240

GATCCATCCT GGCGCCTCAA TCCCTTCTAC CGGCGCACCC TGCAGGCCTA CCTGGCGTGG 300 

CAGAAACAAC TGCTCGCCTG GATCGACGAA AGCAACCTGG ACTGCGACGA TCGCGCCCGC 360 


GCCCGCTTCC TCGTCGCCTT GCTCTCCGAC GCCGTGGCAC CCAGCAACAG CCTGATCAAT 420 

CCACTGGCGT TAAAGGAACT GTTCAATACC GGCGGGATCA GCCTGCTCAA TGGCGTCCGC 480 

CACCTGCTCG AAGACCTGGT GCACAACGGC GGCATGCCCA GCCAGGTGAA CAAGACCGCC 540

TTCGAGATCG GTCGCAACCT CGCCACCACG CAAGGCGCGG TGGTGTTCCG CAACGAGGTG 600 

CTGGAGCTGA TCCAGTACAA GCCGCTGGGC GAGCGCCAGT ACGCCAAGCC CCTGCTGATC 660 






GTGCCGCCGC AGATCAACAA GTACTACATC TTCGACCTGT CGCCGGAAAA GAGCTTCGTC 720

CAGTACGCCC TGAAGAACAA CCTGCAGGTC TTCGTCATCA GTTGGCGCAA CCCCGACGCC 780

CAGCACCGCG AATGGGGCCT GAGCACCTAT GTCGAGGCCC TCGACCAGGC CATCGAGGTC 840 

AGCCGCGAGA TCACCGGCAG CCGCAGCGTG AACCTGGCCG GCGCCTGCGC CGGCGGGCTC 900 

ACCGTAGCCG CCTTGCTCGG CCACCTGCAG GTGCGCCGGC AACTGCGCAA GGTCAGTAGC 960 

GTCACCTACC TGGTCAGCCT GCTCGACAGC CAGATGGAAA GCCCGGCGAT GCTCTTCGCC 1020 

GACGAGCAGA CCCTGGAGAG CAGCAAGCGC CGCTCCTACC AGCATGGCGT GCTGGACGGG 1080 

CGCGACATGG CCAAGGTGTT CGCCTGGATG CGCCCCAACG ACCTGATCTG GAACTACTGG 1140 

GTCAACAACT ACCTGCTCGG CAGGCAGCCG CCGGCGTTCG ACATCCTCTA CTGGAACAAC 1200 

GACAACACGC GGCTGCCCGC GGCGTTCCAC GGCGAACTGC TCGACCTGTT CAAGCACAAC 1260 

CCGCTGACCC GCCCGGGCGC GCTGGAGGTC AGCGGGACCG CGGTGGACCT GGGCAAGGTG 1320 

GCGATCGACA GCTTCCACGT CGCCGGCATC ACCGACCACA TCACGCCCTG GGACGCGGTG 1380

TATCGCTCGG CCCTCCTGCT GGGCGGCCAG CGCCGCTTCA TCCTGTCCAA CAGCGGGCAC 1440 

ATCCAGAGCA TCCTCAACCC TCCCGGAAAC CCCAAGGCCT GCTACTTCGA GAACGACAAG 1500 

CTGAGCAGCG ATCCACGCGC CTGGTACTAC GACGCCAAGC GCGAAGAGGG CAGCTGGTGG 1560 

CCGGTCTGGC TGGGCTGGCT GCAGGAGCGC TCGGGCGAGC TGGGCAACCC TGACTTCAAC 1620 

CTTGGCAGCG CCGCGCATCC GCCCCTCGAA GCGGCCCCGG GCACCTACGT GCATATACGC 1680 

TCAAAAGCTT TGGGCAAAGG TGTTACCGAG GAACAATTCA AAGAGACCTG GACGAGGCCG 1740 

GGAGCTGCTG GAATGGGCGA AGGGACTAGC CTTGTGGTGG CCAAGTCCAG AATG 1794 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 598 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Arg Glu Lys Gln Glu Ser Gly Ser Val Pro Val Pro Ala Glu Phe
1 5 10 15

Met Ser Ala Gln Ser Ala Ile Val Gly Leu Arg Gly Lys Asp Leu Leu
20 25 30

Thr Thr Val Arg Ser Leu Ala Val His Gly Leu Arg Gln Pro Leu His
35 40 45

Ser Ala Arg His Leu Val Ala Phe Gly Gly Gln Leu Gly Lys Val Leu
50 55 60

Leu Gly Asp Thr Leu His Gln Pro Asn Pro Gln Asp Ala Arg Phe Gln
65 70 75 80

Asp Pro Ser Trp Arg Leu Asn Pro Phe Tyr Arg Arg Thr Leu Gln Ala
85 90 95

Tyr Leu Ala Trp Gln Lys Gln Leu Leu Ala Trp Ile Asp Glu Ser Asn
100 105 110

Leu Asp Cys Asp Asp Arg Ala Arg Ala Arg Phe Leu Val Ala Leu Leu
115 120 125

Ser Asp Ala Val Ala Pro Ser Asn Ser Leu Ile Asn Pro Leu Ala Leu
130 135 140

Lys Glu Leu Phe Asn Thr Gly Gly Ile Ser Leu Leu Asn Gly Val Arg
145 150 155 160

His Leu Leu Glu Asp Leu Val His Asn Gly Gly Met Pro Ser Gln Val
165 170 175

Asn Lys Thr Ala Phe Glu Ile Gly Arg Asn Leu Ala Thr Thr Gln Gly
180 185 190

Ala Val Val Phe Arg Asn Glu Val Leu Glu Leu Ile Gln Tyr Lys Pro
195 200 205

Leu Gly Glu Arg Gln Tyr Ala Lys Pro Leu Leu Ile Val Pro Pro Gln
210 215 220

Ile Asn Lys Tyr Tyr Ile Phe Asp Leu Ser Pro Glu Lys Ser Phe Val
225 230 235 240

Gln Tyr Ala Leu Lys Asn Asn Leu Gln Val Phe Val Ile Ser Trp Arg
245 250 255

Asn Pro Asp Ala Gln His Arg Glu Trp Gly Leu Ser Thr Tyr Val Glu
260 265 270

Ala Leu Asp Gln Ala Ile Glu Val Ser Arg Glu Ile Thr Gly Ser Arg
275 280 285

Ser Val Asn Leu Ala Gly Ala Cys Ala Gly Gly Leu Thr Val Ala Ala
290 295 300

Leu Leu Gly His Leu Gln Val Arg Arg Gln Leu Arg Lys Val Ser Ser
305 310 315 320

Val Thr Tyr Leu Val Ser Leu Leu Asp Ser Gln Met Glu Ser Pro Ala
325 330 335

Met Leu Phe Ala Asp Glu Gln Thr Leu Glu Ser Ser Lys Arg Arg Ser
340 345 350

Tyr Gln His Gly Val Leu Asp Gly Arg Asp Met Ala Lys Val Phe Ala
355 360 365

Trp Met Arg Pro Asn Asp Leu Ile Trp Asn Tyr Trp Val Asn Asn Tyr
370 375 380

Leu Leu Gly Arg Gln Pro Pro Ala Phe Asp Ile Leu Tyr Trp Asn Asn
385 390 395 400

Asp Asn Thr Arg Leu Pro Ala Ala Phe His Gly Glu Leu Leu Asp Leu
405 410 415

Phe Lys His Asn Pro Leu Thr Arg Pro Gly Ala Leu Glu Val Ser Gly
420 425 430

Thr Ala Val Asp Leu Gly Lys Val Ala Ile Asp Ser Phe His Val Ala
435 440 445

Gly Ile Thr Asp His Ile Thr Pro Trp Asp Ala Val Tyr Arg Ser Ala
450 455 460

Leu Leu Leu Gly Gly Gln Arg Arg Phe Ile Leu Ser Asn Ser Gly His
465 470 475 480

Ile Gln Ser Ile Leu Asn Pro Pro Gly Asn Pro Lys Ala Cys Tyr Phe
485 490 495

Glu Asn Asp Lys Leu Ser Ser Asp Pro Arg Ala Trp Tyr Tyr Asp Ala
500 505 510

Lys Arg Glu Glu Gly Ser Trp Trp Pro Val Trp Leu Gly Trp Leu Gln
515 520 525

Glu Arg Ser Gly Glu Leu Gly Asn Pro Asp Phe Asn Leu Gly Ser Ala
530 535 540

Ala His Pro Pro Leu Glu Ala Ala Pro Gly Thr Tyr Val His Ile Arg
545 550 555 560

Ser Lys Ala Leu Gly Lys Gly Val Thr Glu Glu Gln Phe Lys Glu Thr
565 570 575

Trp Thr Arg Pro Gly Ala Ala Gly Met Gly Glu Gly Thr Ser Leu Val
580 585 590

Val Ala Lys Ser Arg Met
595

(2) INFORMATION FOR SEQ ID NO: 21: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2737 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
GAATTCATGT CTCCAGTTGA TTTTAAAGAT AAAGTTGTGA TCATTACCGG TGCCGGTGGT 60

GGTTTGGGTA AATACTACTC CCTCGAATTT GCCAAGTTGG GCGCCAAAGT CGTCGTTAAC 120

GACTTGGGTG GTGCCTTGAA CGGTCAAGGT GGAAACTCCA AGGCCGCCGA CGTTGTCGTT 180

GACGAAATTG TCAAGAACGG TGGTGTTGCC GTTGCCGATT ACAACAACGT CTTGGACGGT 240

GACAAGATTG TCGAAACCGC CGTCAAGAAC TTTGGTACTG TCCACGTTAT CATCAACAAT 300

GCCGGTATCT TGAGAGATGC CTCCATGAAG AAGATGACTG AAAAAGACTA CAAATTGGTC 360

ATTGACGTGC ACTTGAACGG TGCCTTTGCC GTCACCAAGG CTGCTTGGCC ATACTTCCAA 420

AAGCAAAAAT ACGGTAGAAT TGTCAACACA TCCTCCCCAG CTGGTTTGTA CGGTAACTTT 480

GGTCAAGCCA ACTACGCCTC CGCCAAGTCT GCTTTGTTGG GATTCGCTGA AACCTTGGCC 540

AAGGAAGGTG CCAAATACAA CATCAAGGCC AACGCCATTG CTCCGTTGGC CAGATCAAGA 600

ATGACTGAAT CTATCTTGCC ACCTCCAATG TTGGAAAAAT TGGGCCCTGA AAAGGTTGCC 660

CCATTGGTCT TGTATTTGTC GTCAGCTGAA AACGAATTGA CTGGTCAATT CTTTGAAGTT 720

GCTGCTGGCT TTTACGCTCA GATCAGATGG GAAAGATCCG GTGGTGTCTT GTTCAAGCCA 780

GATCAATCCT TCACCGCTGA GGTTGTTGCT AAGAGATTCT CTGAAATCCT TGATTATGAC 840

GACTCTAGGA AGCCAGAATA CTTGAAGAAC CAATACCCAT TCATGTTGAA CGACTACGCC 900

ACTTTGACCA ACGAAGCTAG AAAGTTGCCA GCTAACGATG CTTCTGGTGC TCCAACTGTC 960

TCCTTGAAGG ACAAGGTTGT TTTGATCACC GGTGCCGGTG CTGGTTTGGG TAAAGAATAC 1020 

GCCAAGTGGT TCGCCAAGTA CGGTGCCAAG GTTGTTGTTA ACGACTTCAA GGATGCTACC 1080 

AAGACCGTTG ACGAAATCAA AGCCGCTGGT GGTGAAGCTT GGCCAGATCA ACACGATGTT 1140 

GCCAAGGACT CCGAAGCTAT CATCAAGAAT GTCATTGACA AGTACGGTAC CATTGATATC 1200

TTGGTCAACA ACGCCGGTAT CTTGAGAGAC AGATCCTTTG CCAAGATGTC CAAGCAAGAA 1260 






TGGGACTCTG TCCAACAAGT CCACTTGATT GGTACTTTCA ACTTGAGCAG ATTGGCATGG 1320 



CCATACTTTG TTGAAAAACA ATTTGGTAGA ATCATCAACA TTACCTCCAC CAGTGGTATC 1380

TACGGTAACT TTGGTCAAGC CAACTACTCG TCTTCTAAGG CTGGTATCTT GGGTTTGTCC 1440

AAGACCATGG CCATTGAAGG TGCTAAGAAT AACATTAAGG TCAACATTGT TGCTCCACAC 1500 

GCTGAAACTG CCATGACCTT GACCATCTTC AGAGAACAAG ACAAGAACTT GTACCACGGT 1560 

GACCAAGTTG CTCCATTGTT GGTCTACTTG GGTACTGACG ATGTCCCAGT CACCGGTGAA 1620 

ACTTCCGAAA TCGGTGGTGG TTGGATCGGT AACACCAGAT GGCAAAGAGC CAAGGGTGCT 1680 

GTCTCCCACG ACGAACACAC CACTGTTGAA TTCATCAAGG AGCACTTGAA CGAAATCACT 1740 

GACTTCACCA CTGACACTGA AAATCCAAAA TCTACCACCG AATCCTCCAT GGCTATCTTG 1800 

TCTGCCGTTG GTGGTGATGA CGATGATGAT GACGAAGACG AAGAAGAAGA CGAAGGTGAT 1860 

GAAGAAGAAG ACGAAGAAGA CGAAGAAGAA GACGATCCAG TCTGGAGATT CGACGACAGA 1920 

GATGTTATCT TGTACAACAT TGCCCTTGGT GCCACCACCA AGCAATTGAA GTACGTCTAC 1980 

GAAAACGACT CTGACTTCCA AGTCATTCCA ACCTTTGGTC ACTTGATCAC CTTCAACTCT 2040 

GGTAAGTCAC AAAACTCCTT TGCCAAGTTG TTGCGTAACT TCAACCCAAT GTTGTTGTTG 2100 

CACGGTGAAC ACTACTTGAA GGTGCACAGC TGGCCACCAC CAACCGAAGG TGAAATCAAG 2160 

ACCACTTTCG AACCAATTGC CACTACTCCA AAGGGTACCA ACGTTGTTAT TGTTCACGGT 2220

TCCAAATCTG TTGACAACAA GTCTGGTGAA TTGATTTACT CCAACGAAGC CACTTACTTC 2280

ATCAGAAACT GTCAAGCCGA CAACAAGGTC TACGCTGACC GTCCAGCATT CGCCACCAAC 2340

CAATTCTTGG CACCAAAGAG AGCCCCAGAC TACCAAGTTG ACGTTCCAGT CAGTGAAGAC 2400 

TTGGCTGCTT TGTACCGTTT GTCTGGTGAC AGAAACCCAT TGCACATTGA TCCAAACTTT 2460 

GCTAAAGGTG CCAAGTTCCC TAAGCCAATC TTACACGGTA TGTGCACTTA TGGTTTGAGT 2520 

GCTAAGGCTT TGATTGACAA GTTTGGTATG TTCAACGAAA TCAAGGCCAG ATTCACCGGT 2580 

ATTGTCTTCC CAGGTGAAAC CTTGAGAGTC TTGGCATGGA AGGAAAGCGA TGACACTATT 2640 

GTCTTCCAAA CTCATGTTGT TGATAGAGGT ACTATTGCCA TTAACAACGC TGCTATTAAG 2700 

TTAGTCGGTG ACAAAGCAAA GATCTAATGA AGGATCC 2737 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 906 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Ser Pro Val Asp Phe Lys Asp Lys Val Val Ile Ile Thr Gly Ala
1 5 10 15

Gly Gly Gly Leu Gly Lys Tyr Tyr Ser Leu Glu Phe Ala Lys Leu Gly
20 25 30

Ala Lys Val Val Val Asn Asp Leu Gly Gly Ala Leu Asn Gly Gln Gly
35 40 45

Gly Asn Ser Lys Ala Ala Asp Val Val Val Asp Glu Ile Val Lys Asn
50 55 60

Gly Gly Val Ala Val Ala Asp Tyr Asn Asn Val Leu Asp Gly Asp Lys
65 70 75 80

Ile Val Glu Thr Ala Val Lys Asn Phe Gly Thr Val His Val Ile Ile
85 90 95

Asn Asn Ala Gly Ile Leu Arg Asp Ala Ser Met Lys Lys Met Thr Glu
100 105 110

Lys Asp Tyr Lys Leu Val Ile Asp Val His Leu Asn Gly Ala Phe Ala
115 120 125

Val Thr Lys Ala Ala Trp Pro Tyr Phe Gln Lys Gln Lys Tyr Gly Arg
130 135 140

Ile Val Asn Thr Ser Ser Pro Ala Gly Leu Tyr Gly Asn Phe Gly Gln
145 150 155 160

Ala Asn Tyr Ala Ser Ala Lys Ser Ala Leu Leu Gly Phe Ala Glu Thr
165 170 175

Leu Ala Lys Glu Gly Ala Lys Tyr Asn Ile Lys Ala Asn Ala Ile Ala
180 185 190

Pro Leu Ala Arg Ser Arg Met Thr Glu Ser Ile Leu Pro Pro Pro Met
195 200 205

Leu Glu Lys Leu Gly Pro Glu Lys Val Ala Pro Leu Val Leu Tyr Leu
210 215 220

Ser Ser Ala Glu Asn Glu Leu Thr Gly Gln Phe Phe Glu Val Ala Ala
225 230 235 240

Gly Phe Tyr Ala Gln Ile Arg Trp Glu Arg Ser Gly Gly Val Leu Phe
245 250 255

Lys Pro Asp Gln Ser Phe Thr Ala Glu Val Val Ala Lys Arg Phe Ser
260 265 270

Glu Ile Leu Asp Tyr Asp Asp Ser Arg Lys Pro Glu Tyr Leu Lys Asn
275 280 285

Gln Tyr Pro Phe Met Leu Asn Asp Tyr Ala Thr Leu Thr Asn Glu Ala
290 295 300

Arg Lys Leu Pro Ala Asn Asp Ala Ser Gly Ala Pro Thr Val Ser Leu
305 310 315 320

Lys Asp Lys Val Val Leu Ile Thr Gly Ala Gly Ala Gly Leu Gly Lys
325 330 335

Glu Tyr Ala Lys Trp Phe Ala Lys Tyr Gly Ala Lys Val Val Val Asn
340 345 350

Asp Phe Lys Asp Ala Thr Lys Thr Val Asp Glu Ile Lys Ala Ala Gly
355 360 365

Gly Glu Ala Trp Pro Asp Gln His Asp Val Ala Lys Asp Ser Glu Ala
370 375 380

Ile Ile Lys Asn Val Ile Asp Lys Tyr Gly Thr Ile Asp Ile Leu Val
385 390 395 400

Asn Asn Ala Gly Ile Leu Arg Asp Arg Ser Phe Ala Lys Met Ser Lys
405 410 415

Gln Glu Trp Asp Ser Val Gln Gln Val His Leu Ile Gly Thr Phe Asn
420 425 430

Leu Ser Arg Leu Ala Trp Pro Tyr Phe Val Glu Lys Gln Phe Gly Arg
435 440 445

Ile Ile Asn Ile Thr Ser Thr Ser Gly Ile Tyr Gly Asn Phe Gly Gln
450 455 460

Ala Asn Tyr Ser Ser Ser Lys Ala Gly Ile Leu Gly Leu Ser Lys Thr
465 470 475 480

Met Ala Ile Glu Gly Ala Lys Asn Asn Ile Lys Val Asn Ile Val Ala
485 490 495

Pro His Ala Glu Thr Ala Met Thr Leu Thr Ile Phe Arg Glu Gln Asp
500 505 510

Lys Asn Leu Tyr His Ala Asp Gln Val Ala Pro Leu Leu Val Tyr Leu
515 520 525

Gly Thr Asp Asp Val Pro Val Thr Gly Glu Thr Ser Glu Ile Gly Gly
530 535 540

Gly Trp Ile Gly Asn Thr Arg Trp Gln Arg Ala Lys Gly Ala Val Ser
545 550 555 560

His Asp Glu His Thr Thr Val Glu Phe Ile Lys Glu His Leu Asn Glu
565 570 575

Ile Thr Asp Phe Thr Thr Asp Thr Glu Asn Pro Lys Ser Thr Thr Glu
580 585 590

Ser Ser Met Ala Ile Leu Ser Ala Val Gly Gly Asp Asp Asp Asp Asp
595 600 605

Asp Glu Asp Glu Glu Glu Asp Glu Gly Asp Glu Glu Glu Asp Glu Glu
610 615 620

Asp Glu Glu Glu Asp Asp Pro Val Trp Arg Phe Asp Asp Arg Asp Val
625 630 635 640

Ile Leu Tyr Asn Ile Ala Leu Gly Ala Thr Thr Lys Gln Leu Lys Tyr
645 650 655

Val Tyr Glu Asn Asp Ser Asp Phe Gln Val Ile Pro Thr Phe Gly His
660 665 670

Leu Ile Thr Phe Asn Ser Gly Lys Ser Gln Asn Ser Phe Ala Lys Leu
675 680 685

Leu Arg Asn Phe Asn Pro Met Leu Leu Leu His Gly Glu His Tyr Leu
690 695 700

Lys Val His Ser Trp Pro Pro Pro Thr Glu Gly Glu Ile Lys Thr Thr
705 710 715 720

Phe Glu Pro Ile Ala Thr Thr Pro Lys Gly Thr Asn Val Val Ile Val
725 730 735

His Gly Ser Lys Ser Val Asp Asn Lys Ser Gly Glu Leu Ile Tyr Ser
740 745 750

Asn Glu Ala Thr Tyr Phe Ile Arg Asn Cys Gln Ala Asp Asn Lys Val
755 760 765

Tyr Ala Asp Arg Pro Ala Phe Ala Thr Asn Gln Phe Leu Ala Pro Lys
770 775 780

Arg Ala Pro Asp Tyr Gln Val Asp Val Pro Val Ser Glu Asp Leu Ala
785 790 795 800

Ala Leu Tyr Arg Leu Ser Gly Asp Arg Asn Pro Leu His Ile Asp Pro
805 810 815

Asn Phe Ala Lys Gly Ala Lys Phe Pro Lys Pro Ile Leu His Gly Met
820 825 830

Cys Thr Tyr Gly Leu Ser Ala Lys Ala Leu Ile Asp Lys Phe Gly Met
835 840 845

Phe Asn Glu Ile Lys Ala Arg Phe Thr Gly Ile Val Phe Pro Gly Glu
850 855 860

Thr Leu Arg Val Leu Ala Trp Lys Glu Ser Asp Asp Thr Ile Val Phe
865 870 875 880

Gln Thr His Val Val Asp Arg Gly Thr Ile Ala Ile Asn Asn Ala Ala
885 890 895

Ile Lys Leu Val Gly Asp Lys Ala Lys Ile
900 905

(2) INFORMATION FOR SEQ ID NO: 23: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2737 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

GGATCCATGT CTCCAGTTGA TTTTAAAGAT AAAGTTGTGA TCATTACCGG TGCCGGTGGT 60

GGTTTGGGTA AATACTACTC CCTCGAATTT GCCAAGTTGG GCGCCAAAGT CGTCGTTAAC 120

GACTTGGGTG GTGCCTTGAA CGGTCAAGGT GGAAACTCCA AGGCCGCCGA CGTTGTCGTT 180

GACGAAATTG TCAAGAACGG TGGTGTTGCC GTTGCCGATT ACAACAACGT CTTGGACGGT 240

GACAAGATTG TCGAAACCGC CGTCAAGAAC TTTGGTACTG TCCACGTTAT CATCAACAAT 300

GCCGGTATCT TGAGAGATGC CTCCATGAAG AAGATGACTG AAAAAGACTA CAAATTGGTC 360

ATTGACGTGC ACTTGAACGG TGCCTTTGCC GTCACCAAGG CTGCTTGGCC ATACTTCCAA 420

AAGCAAAAAT ACGGTAGAAT TGTCAACACA TCCTCCCCAG CTGGTTTGTA CGGTAACTTT 480

GGTCAAGCCA ACTACGCCTC CGCCAAGTCT GCTTTGTTGG GATTCGCTGA AACCTTGGCC 540

AAGGAAGGTG CCAAATACAA CATCAAGGCC AACGCCATTG CTCCGTTGGC CAGATCAAGA 600

ATGACTGAAT CTATCTTGCC ACCTCCAATG TTGGAAAAAT TGGGCCCTGA AAAGGTTGCC 660

CCATTGGTCT TGTATTTGTC GTCAGCTGAA AACGAATTGA CTGGTCAATT CTTTGAAGTT 720

GCTGCTGGCT TTTACGCTCA GATCAGATGG GAAAGATCCG GTGGTGTCTT GTTCAAGCCA 780

GATCAATCCT TCACCGCTGA GGTTGTTGCT AAGAGATTCT CTGAAATCCT TGATTATGAC 840

GACTCTAGGA AGCCAGAATA CTTGAAGAAC CAATACCCAT TCATGTTGAA CGACTACGCC 900

ACTTTGACCA ACGAAGCTAG AAAGTTGCCA GCTAACGATG CTTCTGGTGC TCCAACTGTC 960

TCCTTGAAGG ACAAGGTTGT TTTGATCACC GGTGCCGGTG CTGGTTTGGG TAAAGAATAC 1020

GCCAAGTGGT TCGCCAAGTA CGGTGCCAAG GTTGTTGTTA ACGACTTCAA GGATGCTACC 1080

AAGACCGTTG ACGAAATCAA AGCCGCTGGT GGTGAAGCTT GGCCAGATCA ACACGATGTT 1140

GCCAAGGACT CCGAAGCTAT CATCAAGAAT GTCATTGACA AGTACGGTAC CATTGATATC 1200

TTGGTCAACA ACGCCGGTAT CTTGAGAGAC AGATCCTTTG CCAAGATGTC CAAGCAAGAA 1260

TGGGACTCTG TCCAACAAGT CCACTTGATT GGTACTTTCA ACTTGAGCAG ATTGGCATGG 1320

CCATACTTTG TTGAAAAACA ATTTGGTAGA ATCATCAACA TTACCTCCAC CAGTGGTATC 1380

TACGGTAACT TTGGTCAAGC CAACTACTCG TCTTCTAAGG CTGGTATCTT GGGTTTGTCC 1440

AAGACCATGG CCATTGAAGG TGCTAAGAAT AACATTAAGG TCAACATTGT TGCTCCACAC 1500

GCTGAAACTG CCATGACCTT GACCATCTTC AGAGAACAAG ACAAGAACTT GTACCACGCT 1560

GACCAAGTTG CTCCATTGTT GGTCTACTTG GGTACTGACG ATGTCCCAGT CACCGGTGAA 1620

ACTTCCGAAA TCGGTGGTGG TTGGATCGGT AACACCAGAT GGCAAAGAGC CAAGGGTGCT 1680

GTCTCCCACG ACGAACACAC CACTGTTGAA TTCATCAAGG AGCACTTGAA CGAAATCACT 1740

GACTTCACCA CTGACACTGA AAATCCAAAA TCTACCACCG AATCCTCCAT GGCTATCTTG 1800

TCTGCCGTTG GTGGTGATGA CGATGATGAT GACGAAGACG AAGAAGAAGA CGAAGGTGAT 1860

GAAGAAGAAG ACGAAGAAGA CGAAGAAGAA GACGATCCAG TCTGGAGATT CGACGACAGA 1920

GATGTTATCT TGTACAACAT TGCCCTTGGT GCCACCACCA AGCAATTGAA GTACGTCTAC 1980

GAAAACGACT CTGACTTCCA AGTCATTCCA ACCTTTGGTC ACTTGATCAC CTTCAACTCT 2040

GGTAAGTCAC AAAACTCCTT TGCCAAGTTG TTGCGTAACT TCAACCCAAT GTTGTTGTTG 2100

CACGGTGAAC ACTACTTGAA GGTGCACAGC TGGCCACCAC CAACCGAAGG TGAAATCAAG 2160

ACCACTTTCG AACCAATTGC CACTACTCCA AAGGGTACCA ACGTTGTTAT TGTTCACGGT 2220

TCCAAATCTG TTGACAACAA GTCTGGTGAA TTGATTTACT CCAACGAAGC CACTTACTTC 2280

ATCAGAAACT GTCAAGCCGA CAACAAGGTC TACGCTGACC GTCCAGCATT CGCCACCAAC 2340

CAATTCTTGG CACCAAAGAG AGCCCCAGAC TACCAAGTTG ACGTTCCAGT CAGTGAAGAC 2400

TTGGCTGCTT TGTACCGTTT GTCTGGTGAC AGAAACCCAT TGCACATTGA TCCAAACTTT 2460

GCTAAAGGTG CCAAGTTCCC TAAGCCAATC TTACACGGTA TGTGCACTTA TGGTTTGAGT 2520

GCTAAGGCTT TGATTGACAA GTTTGGTATG TTCAACGAAA TCAAGGCCAG ATTCACCGGT 2580

ATTGTCTTCC CAGGTGAAAC CTTGAGAGTC TTGGCATGGA AGGAAAGCGA TGACACTATT 2640

GTCTTCCAAA CTCATGTTGT TGATAGAGGT ACTATTGCCA TTAACAACGC TGCTATTAAG 2700

TTAGTCGGTG ACAAATCCAA GTTGTAATGA AGGATCC 2737

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 906 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Ser Pro Val Asp Phe Lys Asp Lys Val Val Ile Ile Thr Gly Ala
1 5 10 15

Gly Gly Gly Leu Gly Lys Tyr Tyr Ser Leu Glu Phe Ala Lys Leu Gly
20 25 30

Ala Lys Val Val Val Asn Asp Leu Gly Gly Ala Leu Asn Gly Gln Gly
35 40 45

Gly Asn Ser Lys Ala Ala Asp Val Val Val Asp Glu Ile Val Lys Asn
50 55 60

Gly Gly Val Ala Val Ala Asp Tyr Asn Asn Val Leu Asp Gly Asp Lys
65 70 75 80

Ile Val Glu Thr Ala Val Lys Asn Phe Gly Thr Val His Val Ile Ile
85 90 95

Asn Asn Ala Gly Ile Leu Arg Asp Ala Ser Met Lys Lys Met Thr Glu
100 105 110

Lys Asp Tyr Lys Leu Val Ile Asp Val His Leu Asn Gly Ala Phe Ala
115 120 125

Val Thr Lys Ala Ala Trp Pro Tyr Phe Gln Lys Gln Lys Tyr Gly Arg
130 135 140

Ile Val Asn Thr Ser Ser Pro Ala Gly Leu Tyr Gly Asn Phe Gly Gln
145 150 155 160

Ala Asn Tyr Ala Ser Ala Lys Ser Ala Leu Leu Gly Phe Ala Glu Thr
165 170 175

Leu Ala Lys Glu Gly Ala Lys Tyr Asn Ile Lys Ala Asn Ala Ile Ala
180 185 190

Pro Leu Ala Arg Ser Arg Met Thr Glu Ser Ile Leu Pro Pro Pro Met
195 200 205

Leu Glu Lys Leu Gly Pro Glu Lys Val Ala Pro Leu Val Leu Tyr Leu
210 215 220

Ser Ser Ala Glu Asn Glu Leu Thr Gly Gln Phe Phe Glu Val Ala Ala
225 230 235 240

Gly Phe Tyr Ala Gln Ile Arg Trp Glu Arg Ser Gly Gly Val Leu Phe
245 250 255

Lys Pro Asp Gln Ser Phe Thr Ala Glu Val Val Ala Lys Arg Phe Ser
260 265 270

Glu Ile Leu Asp Tyr Asp Asp Ser Arg Lys Pro Glu Tyr Leu Lys Asn
275 280 285

Gln Tyr Pro Phe Met Leu Asn Asp Tyr Ala Thr Leu Thr Asn Glu Ala
290 295 300

Arg Lys Leu Pro Ala Asn Asp Ala Ser Gly Ala Pro Thr Val Ser Leu
305 310 315 320

Lys Asp Lys Val Val Leu Ile Thr Gly Ala Gly Ala Gly Leu Gly Lys
325 330 335

Glu Tyr Ala Lys Trp Phe Ala Lys Tyr Gly Ala Lys Val Val Val Asn
340 345 350

Asp Phe Lys Asp Ala Thr Lys Thr Val Asp Glu Ile Lys Ala Ala Gly
355 360 365

Gly Glu Ala Trp Pro Asp Gln His Asp Val Ala Lys Asp Ser Glu Ala
370 375 380

Ile Ile Lys Asn Val Ile Asp Lys Tyr Gly Thr Ile Asp Ile Leu Val
385 390 395 400

Asn Asn Ala Gly Ile Leu Arg Asp Arg Ser Phe Ala Lys Met Ser Lys
405 410 415

Gln Glu Trp Asp Ser Val Gln Gln Val His Leu Ile Gly Thr Phe Asn
420 425 430

Leu Ser Arg Leu Ala Trp Pro Tyr Phe Val Glu Lys Gln Phe Gly Arg
435 440 445

Ile Ile Asn Ile Thr Ser Thr Ser Gly Ile Tyr Gly Asn Phe Gly Gln
450 455 460

Ala Asn Tyr Ser Ser Ser Lys Ala Gly Ile Leu Gly Leu Ser Lys Thr
465 470 475 480

Met Ala Ile Glu Gly Ala Lys Asn Asn Ile Lys Val Asn Ile Val Ala
485 490 495

Pro His Ala Glu Thr Ala Met Thr Leu Thr Ile Phe Arg Glu Gln Asp
500 505 510

Lys Asn Leu Tyr His Ala Asp Gln Val Ala Pro Leu Leu Val Tyr Leu
515 520 525

Gly Thr Asp Asp Val Pro Val Thr Gly Glu Thr Ser Glu Ile Gly Gly
530 535 540

Gly Trp Ile Gly Asn Thr Arg Trp Gln Arg Ala Lys Gly Ala Val Ser
545 550 555 560

His Asp Glu His Thr Thr Val Glu Phe Ile Lys Glu His Leu Asn Glu
565 570 575

Ile Thr Asp Phe Thr Thr Asp Thr Glu Asn Pro Lys Ser Thr Thr Glu
580 585 590

Ser Ser Met Ala Ile Leu Ser Ala Val Gly Gly Asp Asp Asp Asp Asp
595 600 605

Asp Glu Asp Glu Glu Glu Asp Glu Gly Asp Glu Glu Glu Asp Glu Glu
610 615 620

Asp Glu Glu Glu Asp Asp Pro Val Trp Arg Phe Asp Asp Arg Asp Val
625 630 635 640

Ile Leu Tyr Asn Ile Ala Leu Gly Ala Thr Thr Lys Gln Leu Lys Tyr
645 650 655

Val Tyr Glu Asn Asp Ser Asp Phe Gln Val Ile Pro Thr Phe Gly His
660 665 670

Leu Ile Thr Phe Asn Ser Gly Lys Ser Gln Asn Ser Phe Ala Lys Leu
675 680 685

Leu Arg Asn Phe Asn Pro Met Leu Leu Leu His Gly Glu His Tyr Leu
690 695 700

Lys Val His Ser Trp Pro Pro Pro Thr Glu Gly Glu Ile Lys Thr Thr
705 710 715 720

Phe Glu Pro Ile Ala Thr Thr Pro Lys Gly Thr Asn Val Val Ile Val
725 730 735

His Gly Ser Lys Ser Val Asp Asn Lys Ser Gly Glu Leu Ile Tyr Ser
740 745 750

Asn Glu Ala Thr Tyr Phe Ile Arg Asn Cys Gln Ala Asp Asn Lys Val
755 760 765

Tyr Ala Asp Arg Pro Ala Phe Ala Thr Asn Gln Phe Leu Ala Pro Lys
770 775 780

Arg Ala Pro Asp Tyr Gln Val Asp Val Pro Val Ser Glu Asp Leu Ala
785 790 795 800

Ala Leu Tyr Arg Leu Ser Gly Asp Arg Asn Pro Leu His Ile Asp Pro
805 810 815

Asn Phe Ala Lys Gly Ala Lys Phe Pro Lys Pro Ile Leu His Gly Met
820 825 830

Cys Thr Tyr Gly Leu Ser Ala Lys Ala Leu Ile Asp Lys Phe Gly Met
835 840 845

Phe Asn Glu Ile Lys Ala Arg Phe Thr Gly Ile Val Phe Pro Gly Glu
850 855 860

Thr Leu Arg Val Leu Ala Trp Lys Glu Ser Asp Asp Thr Ile Val Phe
865 870 875 880

Gln Thr His Val Val Asp Arg Gly Thr Ile Ala Ile Asn Asn Ala Ala
885 890 895

Ile Lys Leu Val Gly Asp Lys Ser Lys Leu
900 905



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2737 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

GGATCCATGT CTCCAGTTGA TTTTAAAGAT AAAGTTGTGA TCATTACCGG TGCCGGTGGT 60 

GGTTTGGGTA AATACTACTC CCTCGAATTT GCCAAGTTGG GCGCCAAAGT CGTCGTTAAC 120 

GACTTGGGTG GTGCCTTGAA CGGTCAAGGT GGAAACTCCA AGGCCGCCGA CGTTGTCGTT 180 

GACGAAATTG TCAAGAACGG TGGTGTTGCC GTTGCCGATT ACAACAACGT CTTGGACGGT 240 

GACAAGATTG TCGAAACCGC CGTCAAGAAC TTTGGTACTG TCCACGTTAT CATCAACAAT 300 

GCCGGTATCT TGAGAGATGC CTCCATGAAG AAGATGACTG AAAAAGACTA CAAATTGGTC 360 

ATTGACGTGC ACTTGAACGG TGCCTTTGCC GTCACCAAGG CTGCTTGGCC ATACTTCCAA 420 

AAGCAAAAAT ACGGTAGAAT TGTCAACACA TCCTCCCCAG CTGGTTTGTA CGGTAACTTT 480 

GGTCAAGCCA ACTACGCCTC CGCCAAGTCT GCTTTGTTGG GATTCGCTGA AACCTTGGCC 540 

AAGGAAGGTG CCAAATACAA CATCAAGGCC AACGCCATTG CTCCGTTGGC CAGATCAAGA 600 

ATGACTGAAT CTATCTTGCC ACCTCCAATG TTGGAAAAAT TGGGCCCTGA AAAGGTTGCC 660 

CCATTGGTCT TGTATTTGTC GTCAGCTGAA AACGAATTGA CTGGTCAATT CTTTGAAGTT 720

GCTGCTGGCT TTTACGCTCA GATCAGATGG GAAAGATCCG GTGGTGTCTT GTTCAAGCCA 780

GATCAATCCT TCACCGCTGA GGTTGTTGCT AAGAGATTCT CTGAAATCCT TGATTATGAC 840

GACTCTAGGA AGCCAGAATA CTTGAAGAAC CAATACCCAT TCATGTTGAA CGACTACGCC 900

ACTTTGACCA ACGAAGCTAG AAAGTTGCCA GCTAACGATG CTTCTGGTGC TCCAACTGTC 960

TCCTTGAAGG ACAAGGTTGT TTTGATCACC GGTGCCGGTG CTGGTTTGGG TAAAGAATAC 1020

GCCAAGTGGT TCGCCAAGTA CGGTGCCAAG GTTGTTGTTA ACGACTTCAA GGATGCTACC 1080

AAGACCGTTG ACGAAATCAA AGCCGCTGGT GGTGAAGCTT GGCCAGATCA ACACGATGTT 1140

GCCAAGGACT CCGAAGCTAT CATCAAGAAT GTCATTGACA AGTACGGTAC CATTGATATC 1200

TTGGTCAACA ACGCCGGTAT CTTGAGAGAC AGATCCTTTG CCAAGATGTC CAAGCAAGAA 1260

TGGGACTCTG TCCAACAAGT CCACTTGATT GGTACTTTCA ACTTGAGCAG ATTGGCATGG 1320

CCATACTTTG TTGAAAAACA ATTTGGTAGA ATCATCAACA TTACCTCCAC CAGTGGTATC 1380

TACGGTAACT TTGGTCAAGC CAACTACTCG TCTTCTAAGG CTGGTATCTT GGGTTTGTCC 1440

AAGACCATGG CCATTGAAGG TGCTAAGAAT AACATTAAGG TCAACATTGT TGCTCCACAC 1500

GCTGAAACTG CCATGACCTT GACCATCTTC AGAGAACAAG ACAAGAACTT GTACCACGCT 1560

GACCAAGTTG CTCCATTGTT GGTCTACTTG GGTACTGACG ATGTCCCAGT CACCGGTGAA 1620

ACTTCCGAAA TCGGTGGTGG TTGGATCGGT AACACCAGAT GGCAAAGAGC CAAGGGTGCT 1680

GTCTCCCACG ACGAACACAC CACTGTTGAA TTCATCAAGG AGCACTTGAA CGAAATCACT 1740

GACTTCACCA CTGACACTGA AAATCCAAAA TCTACCACCG AATCCTCCAT GGCTATCTTG 1800

TCTGCCGTTG GTGGTGATGA CGATGATGAT GACGAAGACG AAGAAGAAGA CGAAGGTGAT 1860

GAAGAAGAAG ACGAAGAAGA CGAAGAAGAA GACGATCCAG TCTGGAGATT CGACGACAGA 1920

GATGTTATCT TGTACAACAT TGCCCTTGGT GCCACCACCA AGCAATTGAA GTACGTCTAC 1980

GAAAACGACT CTGACTTCCA AGTCATTCCA ACCTTTGGTC ACTTGATCAC CTTCAACTCT 2040

GGTAAGTCAC AAAACTCCTT TGCCAAGTTG TTGCGTAACT TCAACCCAAT GTTGTTGTTG 2100

CACGGTGAAC ACTACTTGAA GGTGCACAGC TGGCCACCAC CAACCGAAGG TGAAATCAAG 2160

ACCACTTTCG AACCAATTGC CACTACTCCA AAGGGTACCA ACGTTGTTAT TGTTCACGGT 2220

TCCAAATCTG TTGACAACAA GTCTGGTGAA TTGATTTACT CCAACGAAGC CACTTACTTC 2280

ATCAGAAACT GTCAAGCCGA CAACAAGGTC TACGCTGACC GTCCAGCATT CGCCACCAAC 2340

CAATTCTTGG CACCAAAGAG AGCCCCAGAC TACCAAGTTG ACGTTCCAGT CAGTGAAGAC 2400

TTGGCTGCTT TGTACCGTTT GTCTGGTGAC AGAAACCCAT TGCACATTGA TCCAAACTTT 2460

GCTAAAGGTG CCAAGTTCCC TAAGCCAATC TTACACGGTA TGTGCACTTA TGGTTTGAGT 2520 

GCTAAGGCTT TGATTGACAA GTTTGGTATG TTCAACGAAA TCAAGGCCAG ATTCACCGGT 2580 

ATTGTCTTCC CAGGTGAAAC CTTGAGAGTC TTGGCATGGA AGGAAAGCGA TGACACTATT 2640 

GTCTTCCAAA CTCATGTTGT TGATAGAGGT ACTATTGCCA TTAACAACGC TGCTATTAAG 2700 

TTAGTCGGTG ACAAATGAAA GATCGAATGA AGGATCC 2737 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 903 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Ser Pro Val Asp Phe Lys Asp Lys Val Val Ile Ile Thr Gly Ala
1 5 10 15

Gly Gly Gly Leu Gly Lys Tyr Tyr Ser Leu Glu Phe Ala Lys Leu Gly
20 25 30

Ala Lys Val Val Val Asn Asp Leu Gly Gly Ala Leu Asn Gly Gln Gly
35 40 45

Gly Asn Ser Lys Ala Ala Asp Val Val Val Asp Glu Ile Val Lys Asn
50 55 60

Gly Gly Val Ala Val Ala Asp Tyr Asn Asn Val Leu Asp Gly Asp Lys
65 70 75 80

Ile Val Glu Thr Ala Val Lys Asn Phe Gly Thr Val His Val Ile Ile
85 90 95

Asn Asn Ala Gly Ile Leu Arg Asp Ala Ser Met Lys Lys Met Thr Glu
100 105 110

Lys Asp Tyr Lys Leu Val Ile Asp Val His Leu Asn Gly Ala Phe Ala
115 120 125

Val Thr Lys Ala Ala Trp Pro Tyr Phe Gln Lys Gln Lys Tyr Gly Arg
130 135 140

Ile Val Asn Thr Ser Ser Pro Ala Gly Leu Tyr Gly Asn Phe Gly Gln
145 150 155 160




WO 99/35278 



PCT/US98/00083 



Ala Asn Tyr Ala Ser Ala Lys Ser Ala Leu Leu Gly Phe Ala Glu Thr
165 170 175

Leu Ala Lys Glu Gly Ala Lys Tyr Asn Ile Lys Ala Asn Ala Ile Ala
180 185 190

Pro Leu Ala Arg Ser Arg Met Thr Glu Ser Ile Leu Pro Pro Pro Met
195 200 205

Leu Glu Lys Leu Gly Pro Glu Lys Val Ala Pro Leu Val Leu Tyr Leu
210 215 220

Ser Ser Ala Glu Asn Glu Leu Thr Gly Gln Phe Phe Glu Val Ala Ala
225 230 235 240

Gly Phe Tyr Ala Gln Ile Arg Trp Glu Arg Ser Gly Gly Val Leu Phe
245 250 255

Lys Pro Asp Gln Ser Phe Thr Ala Glu Val Val Ala Lys Arg Phe Ser
260 265 270

Glu Ile Leu Asp Tyr Asp Asp Ser Arg Lys Pro Glu Tyr Leu Lys Asn
275 280 285

Gln Tyr Pro Phe Met Leu Asn Asp Tyr Ala Thr Leu Thr Asn Glu Ala
290 295 300

Arg Lys Leu Pro Ala Asn Asp Ala Ser Gly Ala Pro Thr Val Ser Leu
305 310 315 320

Lys Asp Lys Val Val Leu Ile Thr Gly Ala Gly Ala Gly Leu Gly Lys
325 330 335

Glu Tyr Ala Lys Trp Phe Ala Lys Tyr Gly Ala Lys Val Val Val Asn
340 345 350

Asp Phe Lys Asp Ala Thr Lys Thr Val Asp Glu Ile Lys Ala Ala Gly
355 360 365

Gly Glu Ala Trp Pro Asp Gln His Asp Val Ala Lys Asp Ser Glu Ala
370 375 380

Ile Ile Lys Asn Val Ile Asp Lys Tyr Gly Thr Ile Asp Ile Leu Val
385 390 395 400

Asn Asn Ala Gly Ile Leu Arg Asp Arg Ser Phe Ala Lys Met Ser Lys
405 410 415

Gln Glu Trp Asp Ser Val Gln Gln Val His Leu Ile Gly Thr Phe Asn
420 425 430

Leu Ser Arg Leu Ala Trp Pro Tyr Phe Val Glu Lys Gln Phe Gly Arg
435 440 445

Ile Ile Asn Ile Thr Ser Thr Ser Gly Ile Tyr Gly Asn Phe Gly Gln
450 455 460

Ala Asn Tyr Ser Ser Ser Lys Ala Gly Ile Leu Gly Leu Ser Lys Thr
465 470 475 480

Met Ala Ile Glu Gly Ala Lys Asn Asn Ile Lys Val Asn Ile Val Ala
485 490 495

Pro His Ala Glu Thr Ala Met Thr Leu Thr Ile Phe Arg Glu Gln Asp
500 505 510

Lys Asn Leu Tyr His Ala Asp Gln Val Ala Pro Leu Leu Val Tyr Leu
515 520 525

Gly Thr Asp Asp Val Pro Val Thr Gly Glu Thr Ser Glu Ile Gly Gly
530 535 540

Gly Trp Ile Gly Asn Thr Arg Trp Gln Arg Ala Lys Gly Ala Val Ser
545 550 555 560

His Asp Glu His Thr Thr Val Glu Phe Ile Lys Glu His Leu Asn Glu
565 570 575

Ile Thr Asp Phe Thr Thr Asp Thr Glu Asn Pro Lys Ser Thr Thr Glu
580 585 590

Ser Ser Met Ala Ile Leu Ser Ala Val Gly Gly Asp Asp Asp Asp Asp
595 600 605

Asp Glu Asp Glu Glu Glu Asp Glu Gly Asp Glu Glu Glu Asp Glu Glu
610 615 620

Asp Glu Glu Glu Asp Asp Pro Val Trp Arg Phe Asp Asp Arg Asp Val
625 630 635 640

Ile Leu Tyr Asn Ile Ala Leu Gly Ala Thr Thr Lys Gln Leu Lys Tyr
645 650 655


Val Tyr Glu Asn 

660 

Leu lie Thr Phe 
675 

Leu Arg Asn Phe 
690 

Lys Val His Ser 
705 

Phe Glu Pro lie 



His Gly Ser Lys 

740 



Asp Ser Asp Phe 



Asn Ser Gly Lys 

680 

Asn Pro Met Leu 
695 

Trp Pro Pro Pro 
710 

Ala Thr Thr Pro 
725 

Ser Val Asp Asn 



Gin Val lie Pro 
665 

Ser Gin Asn Ser 



Leu Leu His Gly 

700 

Thr Glu Gly Glu 
715 

Lys Gly Thr Asn 
730 

Lys Ser Gly Glu 
745 



Thr Phe Gly His 
670 

Phe Ala Lys Leu 
685 

Glu His Tyr Leu 



lie Lys Thr Thr 

720 

Val Val lie Val 
735 

Leu lie Tyr Ser 
750 



-96- 



WO 99/35278 



PCT/US98/00083 



Asn Glu Ala Thr Tyr Phe Ile Arg Asn Cys Gln Ala Asp Asn Lys Val
755 760 765

Tyr Ala Asp Arg Pro Ala Phe Ala Thr Asn Gln Phe Leu Ala Pro Lys
770 775 780

Arg Ala Pro Asp Tyr Gln Val Asp Val Pro Val Ser Glu Asp Leu Ala
785 790 795 800

Ala Leu Tyr Arg Leu Ser Gly Asp Arg Asn Pro Leu His Ile Asp Pro
805 810 815

Asn Phe Ala Lys Gly Ala Lys Phe Pro Lys Pro Ile Leu His Gly Met
820 825 830

Cys Thr Tyr Gly Leu Ser Ala Lys Ala Leu Ile Asp Lys Phe Gly Met
835 840 845

Phe Asn Glu Ile Lys Ala Arg Phe Thr Gly Ile Val Phe Pro Gly Glu
850 855 860

Thr Leu Arg Val Leu Ala Trp Lys Glu Ser Asp Asp Thr Ile Val Phe
865 870 875 880

Gln Thr His Val Val Asp Arg Gly Thr Ile Ala Ile Asn Asn Ala Ala
885 890 895

Ile Lys Leu Val Gly Asp Lys
900
WHAT IS CLAIMED IS:

1. A non-naturally occurring fusion protein comprising: 
a peroxisome targeting protein subunit; and 

a polyhydroxyalkanoate synthase protein subunit. 

2. The fusion protein of claim 1, wherein the peroxisome targeting subunit is PTS2. 

3. The fusion protein of claim 1, wherein the peroxisome targeting subunit comprises a 
tripeptide, wherein: 

the first amino acid in the N-terminus to C-terminus direction is S, A, or P; 
the second amino acid in the N-terminus to C-terminus direction is K, R, S, or H; 
and 

the third amino acid in the N-terminus to C-terminus direction is L, M, I, or F. 

4. The fusion protein of claim 3, wherein the peroxisome targeting subunit comprises 
ARM, SRM, SKL, ARL, SRL, PSI, or PRM. 
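
By way of illustration only, the tripeptide definition of claims 3 and 4 can be read as a three-position pattern. The following minimal sketch, assuming one-letter amino acid codes, checks whether a given tripeptide falls within that pattern. The names `is_claimed_tripeptide` and `all_claimed_tripeptides` are hypothetical helpers introduced for this example and do not appear in the specification.

```python
import re
from itertools import product

# Positions as recited in claim 3 (one-letter codes, N-terminus to C-terminus):
# first: S, A, or P; second: K, R, S, or H; third: L, M, I, or F.
FIRST, SECOND, THIRD = "SAP", "KRSH", "LMIF"
TRIPEPTIDE = re.compile(f"[{FIRST}][{SECOND}][{THIRD}]")


def is_claimed_tripeptide(tripeptide: str) -> bool:
    """Return True if the three-letter string fits the recited pattern."""
    return TRIPEPTIDE.fullmatch(tripeptide.upper()) is not None


def all_claimed_tripeptides() -> list[str]:
    """Enumerate every tripeptide the pattern admits (3 x 4 x 4 combinations)."""
    return ["".join(combo) for combo in product(FIRST, SECOND, THIRD)]


if __name__ == "__main__":
    # The seven tripeptides named in claim 4.
    for example in ["ARM", "SRM", "SKL", "ARL", "SRL", "PSI", "PRM"]:
        print(example, is_claimed_tripeptide(example))
```

Each of the seven tripeptides named in claim 4 satisfies the pattern, which admits 48 combinations in total.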

5. The fusion protein of claim 1, wherein the peroxisome targeting subunit is at least
70% identical to SEQ ID NO: 14. 

6. The fusion protein of claim 5, wherein the peroxisome targeting protein subunit is at 
least 80% identical to SEQ ID NO: 14.

7. The fusion protein of claim 6, wherein the peroxisome targeting protein subunit is at 
least 90% identical to SEQ ID NO: 14. 

8. The fusion protein of claim 7, wherein the peroxisome targeting protein subunit is 
SEQ ID NO: 14. 
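
The "at least 70%, 80%, or 90% identical" thresholds recited in these claims can be illustrated with a short calculation. The sketch below assumes the two sequences have already been aligned to equal length (with `-` marking gaps) and counts identity over all aligned columns, so that a gap counts as a mismatch. That convention is an assumption made for this example, and the claims themselves do not fix a particular alignment method. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption for this sketch: identity is counted over every aligned
    column, gaps ('-') count as mismatches, and columns where both
    sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Placeholder sequences, not taken from the sequence listing.
    identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
    print(f"{identity:.1f}% identical; meets 70% threshold: {identity >= 70.0}")
```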

9. The fusion protein of claim 1, wherein the polyhydroxyalkanoate synthase protein 
subunit is a Pseudomonas subunit.

10. The fusion protein of claim 9, wherein the Pseudomonas subunit is a Pseudomonas
aeruginosa subunit. 

11. The fusion protein of claim 10, wherein the polyhydroxyalkanoate synthase protein
subunit is a PHAC1 subunit. 



12. The fusion protein of claim 11, wherein the polyhydroxyalkanoate synthase protein 
subunit is at least 70% identical to SEQ ID NO:2. 



13. The fusion protein of claim 12, wherein the polyhydroxyalkanoate synthase protein 
subunit is at least 80% identical to SEQ ID NO:2.

14. The fusion protein of claim 13, wherein the polyhydroxyalkanoate synthase protein 
subunit is at least 90% identical to SEQ ID NO:2. 



15. The fusion protein of claim 14, wherein the polyhydroxyalkanoate synthase protein 
subunit is SEQ ID NO:2.



16. The fusion protein of claim 10, wherein the polyhydroxyalkanoate synthase protein 
subunit is a PHAC2 subunit. 



17. The fusion protein of claim 16, wherein the polyhydroxyalkanoate synthase protein
subunit is at least 70% identical to SEQ ID NO:4.



18. The fusion protein of claim 17, wherein the polyhydroxyalkanoate synthase protein 
subunit is at least 80% identical to SEQ ID NO:4. 



19. The fusion protein of claim 18, wherein the polyhydroxyalkanoate synthase protein
subunit is at least 90% identical to SEQ ID NO:4. 



20. The fusion protein of claim 19, wherein the polyhydroxyalkanoate synthase protein 
subunit is SEQ ID NO:4.

21. The fusion protein of claim 1, wherein the polyhydroxyalkanoate synthase protein
subunit is at least 70% identical to SEQ ID NO: 18 or SEQ ID NO:20. 

22. The fusion protein of claim 21, wherein the polyhydroxyalkanoate synthase protein 
subunit is at least 80% identical to SEQ ID NO: 18 or SEQ ID NO:20.

23. The fusion protein of claim 22, wherein the polyhydroxyalkanoate synthase protein 
subunit is at least 90% identical to SEQ ID NO: 18 or SEQ ID NO:20. 

24. The fusion protein of claim 23, comprising SEQ ID NO: 18 or SEQ ID NO:20. 

25. A nucleic acid segment encoding a non-naturally occurring fusion protein, the 
nucleic acid segment comprising: 

a nucleic acid sequence encoding a peroxisome targeting protein subunit; and 

a nucleic acid sequence encoding a polyhydroxyalkanoate synthase protein subunit. 

26. The nucleic acid segment of claim 25, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit comprises at least a 6 contiguous nucleic acid 
sequence from SEQ ID NO: 13. 
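
As a non-limiting illustration, the "at least 6 contiguous" limitation can be read as asking whether a candidate sequence shares any run of six consecutive nucleotides with the reference sequence. The sketch below is one way to test that, under the assumption that a case-insensitive exact match on the same strand is intended. The function name `shares_contiguous_run` and the example sequences are hypothetical placeholders and are not SEQ ID NO: 13.

```python
def shares_contiguous_run(candidate: str, reference: str, k: int = 6) -> bool:
    """Return True if candidate and reference share a run of k contiguous bases.

    Assumption for this sketch: matching is exact, case-insensitive, and on
    the same strand only (no reverse complement is considered).
    """
    if k <= 0:
        raise ValueError("k must be positive")

    candidate = candidate.upper()
    reference = reference.upper()

    # Collect every length-k window of the reference once, then scan the candidate.
    reference_windows = {
        reference[i:i + k] for i in range(len(reference) - k + 1)
    }
    return any(
        candidate[i:i + k] in reference_windows
        for i in range(len(candidate) - k + 1)
    )


if __name__ == "__main__":
    # Placeholder sequences for demonstration only.
    reference = "ATGGCTAGCTAA"
    print(shares_contiguous_run("CCGGCTAGTT", reference))
    print(shares_contiguous_run("TTTTTTTT", reference))
```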

27. The nucleic acid segment of claim 25, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit is at least 70% identical to SEQ ID NO:13. 

28. The nucleic acid segment of claim 27, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit is at least 80% identical to SEQ ID NO: 13. 

29. The nucleic acid segment of claim 28, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit is at least 90% identical to SEQ ID NO: 13.

30. The nucleic acid segment of claim 29, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit is SEQ ID NO: 13.

31. The nucleic acid segment of claim 25, wherein the nucleic acid sequence encoding a
peroxisome targeting protein subunit hybridizes to SEQ ID NO: 13. 



32. The nucleic acid segment of claim 25, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit comprises at least a 6 contiguous 
nucleic acid sequence from: 
SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO:15; or
SEQ ID NO:16.



33. The nucleic acid segment of claim 25, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit is at least 70% identical to: 
SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO: 15; or 
SEQ ID NO: 16. 

34. The nucleic acid segment of claim 33, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit is at least 80% identical to: 
SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO: 15; or 
SEQ ID NO: 16. 

35. The nucleic acid segment of claim 34, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit is at least 90% identical to: 

SEQ ID NO: 1; 
SEQ ID NO:3; 
SEQ ID NO: 15; or 
SEQ ID NO: 16.

36. The nucleic acid segment of claim 35, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit is: 

SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO:15; or
SEQ ID NO:16.

37. The nucleic acid segment of claim 36, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit is: 

SEQ ID NO: 15; or 
SEQ ID NO: 16.



38. The nucleic acid segment of claim 25, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit hybridizes to: 

SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO:15; or
SEQ ID NO:16.



39. The nucleic acid segment of claim 25, wherein the peroxisome targeting protein 
subunit is PTS2.



40. The nucleic acid segment of claim 25, wherein the peroxisome targeting protein 
subunit comprises a tripeptide, the tripeptide having: 

a first amino acid in the N-terminus to C-terminus direction being S, A, or P;
a second amino acid in the N-terminus to C-terminus direction being K, R, S, or H;
and

a third amino acid in the N-terminus to C-terminus direction being L, M, I, or F.



41. The nucleic acid segment of claim 40, wherein the peroxisome targeting subunit 
comprises ARM, SRM, SKL, ARL, SRL, PSI, or PRM.

42. The nucleic acid segment of claim 25, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit encodes at least a 5 contiguous 
amino acid sequence from: 
SEQ ID NO:2; or
SEQ ID NO:4.



43. The nucleic acid segment of claim 25, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit encodes an amino acid sequence at 
least about 70% identical to: 

SEQ ID NO:2; or
SEQ ID NO:4.

44. The nucleic acid segment of claim 43, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit encodes an amino acid sequence at 

least about 80% identical to:

SEQ ID NO:2; or 
SEQ ID NO:4. 



45. The nucleic acid segment of claim 44, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit encodes an amino acid sequence at
least about 90% identical to:

SEQ ID NO:2; or 
SEQ ID NO:4. 



46. The nucleic acid segment of claim 45, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit encodes:
SEQ ID NO:2; or 
SEQ ID NO:4. 

47. A recombinant vector comprising in the 5' to 3' direction:

a) a promoter that directs transcription of a structural nucleic acid sequence
encoding a non-naturally occurring fusion protein, wherein the fusion protein 
comprises: 

i) a peroxisome targeting protein subunit; and 

ii) a polyhydroxyalkanoate synthase protein subunit;

b) a structural nucleic acid sequence encoding a non-naturally occurring fusion 
protein, wherein the fusion protein comprises: 

i) a peroxisome targeting protein subunit; and 

ii) a polyhydroxyalkanoate synthase protein subunit; and 

c) a 3' transcription terminator. 

48. The recombinant vector of claim 47, further comprising a 3' polyadenylation signal 
sequence that directs the addition of polyadenylate nucleotides to the 3' end of RNA
transcribed from the structural nucleic acid coding sequence.

49. The recombinant vector of claim 47, further comprising a selectable marker. 

50. The recombinant vector of claim 49, wherein the selectable marker is a kanamycin 
resistance marker, a hygromycin resistance marker, or a herbicide resistance marker. 

51. The recombinant vector of claim 47, wherein the promoter is constitutive.

52. The recombinant vector of claim 51, wherein the promoter is CaMV35S, enhanced 
CaMV35S, FMV, mas, nos, or ocs. 

53. The recombinant vector of claim 47, wherein the promoter is inducible. 

54. The recombinant vector of claim 53, wherein the promoter is tac, salicylic acid 
induced, polyacrylic acid induced, safener induced, heat shock promoter, nitrate 
induced, hormone induced, or light induced. 

55. The recombinant vector of claim 47, wherein the promoter is tissue specific.

56. The recombinant vector of claim 55, wherein the promoter is the β-conglycinin 7S
promoter, napin promoter, phaseolin promoter, zein promoter, soybean trypsin 
inhibitor promoter, ACP promoter, stearoyl-ACP desaturase promoter, or oleosin 
promoter. 

57. The recombinant vector of claim 47, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit comprises at least a 6 contiguous nucleic acid 
sequence from SEQ ID NO: 13.

58. The recombinant vector of claim 47, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit is at least 70% identical to SEQ ID NO: 13. 

59. The recombinant vector of claim 58, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit is at least 80% identical to SEQ ID NO: 13. 

60. The recombinant vector of claim 59, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit is at least 90% identical to SEQ ID NO: 13. 

61. The recombinant vector of claim 60, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit is SEQ ID NO: 13. 

62. The recombinant vector of claim 47, wherein the nucleic acid sequence encoding a 
peroxisome targeting protein subunit hybridizes to SEQ ID NO: 13. 

63. The recombinant vector of claim 47, wherein the peroxisome targeting protein 
subunit is PTS2. 

64. The recombinant vector of claim 47, wherein the peroxisome targeting protein 
subunit comprises a tripeptide, the tripeptide having: 

a first amino acid in the N-terminus to C-terminus direction being S, A, or P; 
a second amino acid in the N-terminus to C-terminus direction being K, R, S, or H; 
and 

a third amino acid in the N-terminus to C-terminus direction being L, M, I, or F.

65. The recombinant vector of claim 64, wherein the peroxisome targeting subunit
comprises ARM, SRM, SKL, ARL, SRL, PSI, or PRM. 



66. The recombinant vector of claim 47, wherein the polyhydroxyalkanoate synthase
protein subunit is a Pseudomonas subunit. 

67. The recombinant vector of claim 66, wherein the Pseudomonas subunit is a
Pseudomonas aeruginosa subunit. 

68. The recombinant vector of claim 47, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit comprises at least a 6 contiguous
nucleic acid sequence from:
SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO:15; or
SEQ ID NO:16.

69. The recombinant vector of claim 47, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit is at least 70% identical to:
SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO:15; or
SEQ ID NO:16.

70. The recombinant vector of claim 69, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit is at least 80% identical to:
SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO:15; or
SEQ ID NO:16.

71. The recombinant vector of claim 70, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit is at least 90% identical to: 

SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO: 15; or 
SEQ ID NO: 16. 

72. The recombinant vector of claim 71, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit is: 

SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO: 15; or 
SEQ ID NO: 16. 

73. The recombinant vector of claim 72, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit is: 

SEQ ID NO: 15; or 
SEQ ID NO: 16.

74. The recombinant vector of claim 47, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit hybridizes to: 

SEQ ID NO:1;
SEQ ID NO:3;
SEQ ID NO: 15; or 
SEQ ID NO: 16. 

75. The recombinant vector of claim 47, wherein the nucleic acid sequence encoding a 
polyhydroxyalkanoate synthase protein subunit encodes at least a 5 contiguous 
amino acid sequence from: 

SEQ ID NO:2; or 
SEQ ID NO:4.

76. The recombinant vector of claim 47, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit encodes an amino acid sequence at 
least about 70% identical to: 
SEQ ID NO:2; or 
SEQ ID NO:4. 



77. The recombinant vector of claim 76, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit encodes an amino acid sequence at 
least about 80% identical to: 
SEQ ID NO:2; or 
SEQ ID NO:4. 

78. The recombinant vector of claim 77, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit encodes an amino acid sequence at 
least about 90% identical to: 
SEQ ID NO:2; or 
SEQ ID NO:4. 

79. The recombinant vector of claim 78, wherein the nucleic acid sequence encoding a
polyhydroxyalkanoate synthase protein subunit encodes:
SEQ ID NO:2; or
SEQ ID NO:4. 

80. The recombinant vector of claim 47, wherein the structural nucleic acid sequence
comprises:
SEQ ID NO: 17; or
SEQ ID NO: 19.

81. The recombinant vector of claim 47, wherein the structural nucleic acid sequence
encodes: 

SEQ ID NO: 18; or 
SEQ ID NO:20.

82. A recombinant host cell comprising a nucleic acid segment encoding a non-naturally
occurring fusion protein, wherein the nucleic acid segment comprises: 

a nucleic acid sequence encoding a peroxisome targeting protein subunit; and 

a nucleic acid sequence encoding a polyhydroxyalkanoate synthase protein subunit. 

83. The recombinant host cell of claim 82, wherein the recombinant host cell is a fungal 
cell. 

84. The recombinant host cell of claim 83, wherein the fungal cell is a 
Schizosaccharomyces pombe, Streptomyces rimofaciens, Fusarium, Aspergillus
niger, or Saccharomyces cerevisiae cell. 

85. The recombinant host cell of claim 82, wherein the recombinant host cell is a plant 
cell. 

86. The recombinant host cell of claim 85, wherein the plant cell is an alfalfa, banana, 
barley, bean, cabbage, canola/oilseed rape, carrot, castorbean, celery, clover, 
coconut, corn, cotton, cucumber, linseed, melon, olive, palm, parsnip, pea, peanut, 
pepper, potato, potato, radish, rapeseed, rice, soybean, spinach, sunflower, tobacco, 
tomato, or wheat cell. 

87. The recombinant host cell of claim 82, further comprising a nucleic acid segment 
encoding an acyl-ACP thioesterase. 

88. The recombinant host cell of claim 82, further comprising a nucleic acid segment 
encoding a fatty acyl hydroxylase. 

89. The recombinant host cell of claim 82, further comprising a nucleic acid segment 
encoding a yeast multifunctional protein (MFP).

90. The recombinant host cell of claim 82, further comprising a nucleic acid segment
encoding a hydroxyacyl-CoA epimerase. 



91. A genetically transformed plant cell comprising in the 5' to 3' direction:

a) a promoter to direct transcription of a structural nucleic acid sequence
encoding a non-naturally occurring fusion protein, wherein the structural
nucleic acid sequence comprises:

i) a nucleic acid sequence encoding a peroxisome targeting protein
subunit; and

ii) a nucleic acid sequence encoding a polyhydroxyalkanoate synthase
protein subunit;

b) a structural nucleic acid sequence encoding a non-naturally occurring fusion
protein, wherein the structural nucleic acid sequence comprises:

i) a nucleic acid sequence encoding a peroxisome targeting protein
subunit; and

ii) a nucleic acid sequence encoding a polyhydroxyalkanoate synthase
protein subunit;

c) a 3' transcription terminator sequence; and

d) a 3' polyadenylation signal sequence that directs the addition of
polyadenylate nucleotides to the 3' end of RNA transcribed from the
structural nucleic acid coding sequence.



92. The genetically transformed plant cell of claim 91, wherein the plant cell is an
alfalfa, banana, barley, bean, cabbage, canola/oilseed rape, carrot, castorbean, celery,
clover, coconut, corn, cotton, cucumber, linseed, melon, olive, palm, parsnip, pea,
peanut, pepper, potato, potato, radish, rapeseed, rice, soybean, spinach, sunflower,
tobacco, tomato, or wheat cell.



93. The genetically transformed plant cell of claim 91, further comprising a nucleic acid
segment encoding an acyl-ACP thioesterase.

94. The genetically transformed plant cell of claim 91, further comprising a nucleic acid
segment encoding a fatty acyl hydroxylase. 



95. The genetically transformed plant cell of claim 91, further comprising a nucleic acid 
segment encoding a yeast multifunctional protein (MFP). 

96. The genetically transformed plant cell of claim 91, further comprising a nucleic acid 
segment encoding a hydroxyacyl-CoA epimerase.

97. A genetically transformed plant comprising in the 5' to 3' direction:

a) a promoter to direct transcription of a structural nucleic acid sequence 
encoding a non-naturally occurring fusion protein, wherein the structural 
nucleic acid sequence comprises: 

i) a nucleic acid sequence encoding a peroxisome targeting protein 
subunit; and 

ii) a nucleic acid sequence encoding a polyhydroxyalkanoate synthase 
protein subunit; 

b) a structural nucleic acid sequence encoding a non-naturally occurring fusion 
protein, wherein the structural nucleic acid sequence comprises: 

i) a nucleic acid sequence encoding a peroxisome targeting protein 
subunit; and 

ii) a nucleic acid sequence encoding a polyhydroxyalkanoate synthase 
protein subunit; 

c) a 3' transcription terminator sequence; and 

d) a 3' polyadenylation signal sequence that directs the addition of
polyadenylate nucleotides to the 3' end of RNA transcribed from the
structural nucleic acid coding sequence. 

98. The genetically transformed plant of claim 97, wherein the plant is an alfalfa,
banana, barley, bean, cabbage, canola/oilseed rape, carrot, castorbean, celery, clover,
coconut, corn, cotton, cucumber, linseed, melon, olive, palm, parsnip, pea, peanut,
pepper, potato, potato, radish, rapeseed, rice, soybean, spinach, sunflower, tobacco,
tomato, or wheat plant.



99. The genetically transformed plant of claim 97, wherein the promoter is constitutive. 

100. The genetically transformed plant of claim 99, wherein the promoter is CaMV35S, 
enhanced CaMV35S, FMV, mas, nos, or ocs. 

101. The genetically transformed plant of claim 97, wherein the promoter is inducible.

102. The genetically transformed plant of claim 101, wherein the promoter is tac, 
salicylic acid induced, polyacrylic acid induced, safener induced, heat shock 
promoter, nitrate induced, hormone induced, or light induced. 

103. The genetically transformed plant of claim 97, wherein the promoter is tissue 
specific. 

104. The genetically transformed plant of claim 103, wherein the promoter is the
β-conglycinin 7S promoter, napin promoter, phaseolin promoter, zein promoter,
soybean trypsin inhibitor promoter, ACP promoter, stearoyl-ACP desaturase 
promoter, or oleosin promoter. 

105. The genetically transformed plant of claim 97, further comprising a nucleic acid 
segment encoding an acyl-ACP thioesterase. 

106. The genetically transformed plant of claim 97, further comprising a nucleic acid 
segment encoding a fatty acyl hydroxylase. 

107. The genetically transformed plant of claim 97, further comprising a nucleic acid 
segment encoding a yeast multifunctional protein (MFP).

108. The genetically transformed plant of claim 97, further comprising a nucleic acid
segment encoding a hydroxyacyl-CoA epimerase. 

109. A method of preparing host cells useful to produce a non-naturally occurring fusion 
protein comprising the steps of: 

a) selecting a host cell;

b) transforming the selected host cell with a recombinant vector having a 
structural nucleic acid sequence encoding a non-naturally occurring fusion 
protein, wherein the structural nucleic acid sequence comprises: 

i) a nucleic acid sequence encoding a peroxisome targeting protein 
subunit; and 

ii) a nucleic acid sequence encoding a polyhydroxyalkanoate synthase 
protein subunit; and 

c) obtaining transformed host cells. 

110. The method of claim 109, wherein the vector further comprises a selectable marker.

111. The method of claim 110, wherein the selectable marker is a kanamycin resistance 
marker, a hygromycin resistance marker, or a herbicide resistance marker. 

112. The method of claim 109, wherein the host cell is a fungal cell.

113. The method of claim 112, wherein the fungal cell is a Schizosaccharomyces pombe, 
Streptomyces rimofaciens, Fusarium, Aspergillus niger, or Saccharomyces 
cerevisiae cell. 

114. The method of claim 109, wherein the host cell is a plant cell.

115. The method of claim 114, wherein the plant cell is an alfalfa, banana, barley, bean,
cabbage, canola/oilseed rape, carrot, castorbean, celery, clover, coconut, corn,
cotton, cucumber, linseed, melon, olive, palm, parsnip, pea, peanut, pepper, potato,
potato, radish, rapeseed, rice, soybean, spinach, sunflower, tobacco, tomato, or
wheat cell. 



116. A method of preparing a transformed plant useful to produce a non-naturally
occurring fusion protein comprising the steps of: 

a) selecting a host plant cell;

b) transforming the selected host plant cell with a recombinant vector having a 
structural nucleic acid sequence encoding a non-naturally occurring fusion 
protein, wherein the structural nucleic acid sequence comprises: 

i) a nucleic acid sequence encoding a peroxisome targeting protein 
subunit; and 

ii) a nucleic acid sequence encoding a polyhydroxyalkanoate synthase 
protein subunit; 

c) obtaining transformed host plant cells; and 

d) regenerating the transformed host plant cells. 

117. The method of claim 116, wherein the vector further comprises a selectable marker. 

118. The method of claim 117, wherein the selectable marker is a kanamycin resistance 
marker, a hygromycin resistance marker, or a herbicide resistance marker. 

119. The method of claim 116, wherein the host plant cell is an alfalfa, banana, barley,
bean, cabbage, canola/oilseed rape, carrot, castorbean, celery, clover, coconut, corn,
cotton, cucumber, linseed, melon, olive, palm, parsnip, pea, peanut, pepper, potato, 
potato, radish, rapeseed, rice, soybean, spinach, sunflower, tobacco, tomato, or 
wheat cell. 

120. The plant produced by the method of claim 116.

121. A method for the preparation of a polyhydroxyalkanoate, comprising the steps of:

a) obtaining a cell capable of producing a non-naturally occurring fusion 
protein, wherein the fusion protein comprises: 

i) a peroxisome targeting protein subunit; and 

ii) a polyhydroxyalkanoate synthase protein subunit; 

b) establishing a culture of the cell; and 

c) culturing the cell under conditions suitable for the production of the 
polyhydroxyalkanoate. 

122. The method of claim 121, wherein the culture contains natural fatty acids, non- 
natural fatty acids, or mixtures thereof. 

123. The method of claim 121, wherein the cell is a fungal cell. 

124. The method of claim 123, wherein the fungal cell is a Schizosaccharomyces pombe, 
Streptomyces rimofaciens, Fusarium, Aspergillus niger, or Saccharomyces
cerevisiae cell. 

125. The method of claim 121, wherein the cell is a plant cell.

126. The method of claim 125, wherein the cell is an alfalfa, banana, barley, bean,
cabbage, canola/oilseed rape, carrot, castorbean, celery, clover, coconut, corn,
cotton, cucumber, linseed, melon, olive, palm, parsnip, pea, peanut, pepper, potato, 
potato, radish, rapeseed, rice, soybean, spinach, sunflower, tobacco, tomato, or 
wheat cell. 

127. The method of claim 121, wherein the polyhydroxyalkanoate comprises
3-hydroxyhexanoic acid (H:6), 3-hydroxyoctanoic acid (H:8), 3-hydroxydecanoic acid
(H:10), 3-hydroxydodecanoic acid (H:12), 3-hydroxytetradecanoic acid (H:14),
3-hydroxyhexadecanoic acid (H:16), 3-hydroxyheptanoic acid (H:7),
3-hydroxynonanoic acid (H9), 3-hydroxyundecanoic acid (H:11),
3-hydroxytridecanoic acid (H:13), 3-hydroxyhexadecatrienoic acid (H16:3),
3-hydroxyhexadecadienoic acid (H16:2), 3-hydroxyhexadecenoic acid (H16:1),
3-hydroxytetradecatrienoic acid (H14:3), 3-hydroxytetradecadienoic acid (H14:2),
3-hydroxytetradecenoic acid (H14:1), 3-hydroxydodecadienoic acid (H12:2),
3-hydroxydodecenoic acid (H12:1), 3-hydroxyoctenoic acid (H8:1),
4-hydroxydecanoic acid, 8-methyl-3-hydroxynonanoic acid, or
6-methyl-3-hydroxyheptanoic acid monomers.

128. A method for the preparation of a polyhydroxyalkanoate, comprising the steps of: 

a) obtaining a plant capable of producing a non-naturally occurring fusion
protein, wherein the fusion protein comprises: 

i) a peroxisome targeting protein subunit; and 

ii) a polyhydroxyalkanoate synthase protein subunit; and 

b) growing the plant under conditions suitable for the production of the 
polyhydroxyalkanoate. 

129. The method of claim 128, further comprising supplementing the plant with natural 
fatty acids, non-natural fatty acids, or mixtures thereof. 

130. The method of claim 128, wherein the plant is an alfalfa, banana, barley, bean, 
cabbage, canola/oilseed rape, carrot, castorbean, celery, clover, coconut, corn,
cotton, cucumber, linseed, melon, olive, palm, parsnip, pea, peanut, pepper, potato, 
potato, radish, rapeseed, rice, soybean, spinach, sunflower, tobacco, tomato, or 
wheat plant. 

131. The method of claim 128, wherein the polyhydroxyalkanoate comprises
3-hydroxyhexanoic acid (H:6), 3-hydroxyoctanoic acid (H:8), 3-hydroxydecanoic acid
(H:10), 3-hydroxydodecanoic acid (H:12), 3-hydroxytetradecanoic acid (H:14),
3-hydroxyhexadecanoic acid (H:16), 3-hydroxyheptanoic acid (H:7),
3-hydroxynonanoic acid (H9), 3-hydroxyundecanoic acid (H:11),
3-hydroxytridecanoic acid (H:13), 3-hydroxyhexadecatrienoic acid (H16:3),
3-hydroxyhexadecadienoic acid (H16:2), 3-hydroxyhexadecenoic acid (H16:1),
3-hydroxytetradecatrienoic acid (H14:3), 3-hydroxytetradecadienoic acid (H14:2),
3-hydroxytetradecenoic acid (H14:1), 3-hydroxydodecadienoic acid (H12:2),
3-hydroxydodecenoic acid (H12:1), 3-hydroxyoctenoic acid (H8:1),
4-hydroxydecanoic acid, 8-methyl-3-hydroxynonanoic acid, or
6-methyl-3-hydroxyheptanoic acid monomers.

132. A plant containing a polyhydroxyalkanoate, wherein the polyhydroxyalkanoate
comprises 3-hydroxyhexanoic acid (H:6), 3-hydroxyoctanoic acid (H:8),
3-hydroxydecanoic acid (H:10), 3-hydroxydodecanoic acid (H:12),
3-hydroxytetradecanoic acid (H:14), 3-hydroxyhexadecanoic acid (H:16),
3-hydroxyheptanoic acid (H:7), 3-hydroxynonanoic acid (H9), 3-hydroxyundecanoic
acid (H:11), 3-hydroxytridecanoic acid (H:13), 3-hydroxyhexadecatrienoic acid
(H16:3), 3-hydroxyhexadecadienoic acid (H16:2), 3-hydroxyhexadecenoic acid
(H16:1), 3-hydroxytetradecatrienoic acid (H14:3), 3-hydroxytetradecadienoic acid
(H14:2), 3-hydroxytetradecenoic acid (H14:1), 3-hydroxydodecadienoic acid
(H12:2), 3-hydroxydodecenoic acid (H12:1), 3-hydroxyoctenoic acid (H8:1),
4-hydroxydecanoic acid, 8-methyl-3-hydroxynonanoic acid, or
6-methyl-3-hydroxyheptanoic acid monomers.

133. A polyhydroxyalkanoate comprising 3-hydroxyhexadecatrienoic acid (H16:3),
3-hydroxyhexadecadienoic acid (H16:2), 3-hydroxytetradecatrienoic acid (H14:3), or
3-hydroxydodecadienoic acid (H12:2) monomers.